ABSTRACT


The  automatic  solution  of  differential  equations  may  be  accomplished 
by  either  modeling  the  equation  on  an  analog  computer  or  by  solving  it 
numerically  on  a  general-purpose  computer.  Both  methods  are  cumbersome 
and  have  the  disadvantages  of  low  accuracy  and  slow  speed,  respectively. 
The  development  of  the  digital  differential  analyzer  promised  a  machine 
with  improved  accuracy  and  speed.  The  difficulty  in  programming  and  the 
reliance  on  complex  switching  networks  or  patch  boards  brought  about  by 
ever-increasing  parallelism,  however,  have  prevented  the  full  exploitation 
of  the  DDA  capabilities. 

A modular machine structure employing serial-parallel processing and
using incremental integration as its basic algorithm has been developed.
The system consists of self-contained modules which may be operated
independently or may be combined to solve numerically one or more
differential equations. Modularity and serial-parallel processing
simplify the communication methods within and between modules to permit
automatic programming; the hardware requirements are reduced as in
serial processing, but the iteration time cannot exceed a fixed maximum
regardless of the problem.

To eliminate some of the masked instabilities inherent in circular
number systems, a two-loop number system is presented. An extension of
the two-loop system leads to number systems with a hysteresis. Except
for the case of multi-bit communication, it is possible to predict the
outcome of the integrating cycle sufficiently to permit
post-multiplication of the integral increment by a constant or a
variable simultaneously with the integrating cycle. This capability
considerably reduces the solution time and required hardware.

Combining the machine with a general-purpose computer allows automatic
programming and scaling. In this environment, the user-generated
program consists only of the differential equations entered in a standard
format, declarations of dependent and independent variables, the number
of coupled equations to be solved, and some control statements.
Chapter I


INTRODUCTION 


A. Numerical Solution of Differential Equations

The quantitative study of physical systems requires the expression
of the system characteristics in mathematical form. This expression
usually results in some differential equation which, when evaluated,
shows behavior corresponding to that of the original system. The
equations may be linear, nonlinear, or partial differential equations.

The solution of differential equations requires that we find some
function y = y(x,C) such that if the function y is substituted in
the differential equation [say, dy/dx = f(y,x)] the result is an
identity. Since the function y can be found analytically only in a
small number of cases, we resort to numerical methods of finding the
solution. Numerical solutions require the complete specification of
the differential equation (initial conditions and parameters) and
therefore are always particular solutions. The numerical solution may be
found by either differentiating or integrating, but integration is
employed almost exclusively because differentiation involves the
generation of the difference between two very small quantities (which
ideally tend toward zero) and therefore introduces unnecessary errors.

Numerical integration is achieved by replacing the integrand with
some quadrature formula and evaluating this over the required interval
of the independent variable. In this process the independent-variable
interval is divided into subintervals which are usually of equal length.

Traditionally, two basic methods have been available to obtain the
numerical solutions of differential equations. The first is to solve
the equation on a general-purpose computer, using such numerical
techniques as the modified Euler integration, Adams' trapezoidal
integration, or summation of the Taylor series. The second is to model
the equation on an analog computer. Given that we desire high solution
speeds and accuracies, neither of these methods is ideal. The
general-purpose computer is often too slow, and the analog computer
simply cannot provide the accuracy.
If, in the system under study, the dependent variables vary only
with respect to time or some other single independent variable, we have
an ordinary differential equation; if, on the other hand, the dependent
variables vary with respect to two or more independent variables, the
equations will contain partial derivatives. Because in the analog
computer all integration is with respect to time only, these partial
differential equations cannot be solved directly. The use of the
"generalized integrator," which includes a multiplier, allows in effect
integration with respect to a variable other than time, but the
multiplier also introduces additional errors and represents additional
hardware and cost.

Because the analog computer is a completely parallel machine (it
consists of many processors operating simultaneously), its programs must
be hard wired for continuous operation. This requires the use of plug
boards.

B.  Background 

The early development of the digital differential analyzer (DDA),
which is essentially the digital equivalent of the analog integrator,
allowed the modification of analog computers. Replacing analog
integrators with DDA integrators resulted in systems capable of
high-speed solutions and the desired high accuracies; in addition, the
independent variable was no longer restricted to time as in the analog
integrator.

The first such machine to be built was the MADDIDA (Bartee et al.,
1962), developed in 1950. It was considered a low-cost device,
employing a magnetic drum memory to allow arbitrary stored
interconnections such that any DDA integrator could be connected to any
other integrator, including itself. The MADDIDA used binary
communication, which requires a single bit and restricts both the
independent variable input and the integral output increment to the
values of +1 and -1.

Since 1950, the need for higher speed and accuracies has produced
many technological improvements. More accurate algorithms were
introduced (Yu, 1968; Nilsen, 1968) and, to increase operating speed,
subsequent systems had high degrees of parallelism. This latter trend
made it practically impossible to retain stored programs, leading to the
alternatives of single-purpose computers or patch-board programming.


The TRICE (Transistorized Real Time Incremental Computer - Expandable),
developed by Packard Bell Corporation in 1958 (Mitchell, Ruhman, 1958),
was such a machine using plug-board programming and parallel processing
at a rate of 100,000 iterations/sec.

Past and present DDAs have been designed essentially in the manner
of analog computers, and it is this analog approach which, in my opinion,
has prevented the development of a truly general DDA machine which can
find wide acceptance. Existing DDAs are for the most part
"one application" computers, solving the same equations or sets of
equations with different initial conditions or parameters. They are used
for navigational calculations, for computation of projectile
trajectories, or for high-order control-system equations. Any change in
programming involves either hardware modifications (such as plug-board
reprogramming) or complex time and/or space multiplexing schemes to
effect the proper interconnection of the various integrators. As a
result, these methods severely limit the application of the machines
because they require either a great amount of time and skill on the part
of a programmer or enormous amounts of multiplexing hardware. Such
disadvantages have acted as strong deterrents to the full exploitation
of the inherent capabilities of digital incremental integration.

C.  Statement  of  the  Problem 

This investigation has sought a new approach to the problem,
directed toward the development and organization of a special-purpose
machine to solve differential equations numerically. The goal is a
high-speed high-accuracy system that will be compact, adaptable, and,
above all, easy to use. Although the proposed system has not yet been
constructed as hardware, it has been simulated on the Stanford
Computation Center IBM 360 Model 67.

A new machine structure, the digital incremental computer (DIC),
based on a modular concept, is proposed. Each module is a separately
self-contained device that can operate independently or connected to
other modules on one or several problems simultaneously. Its structure
is such that if the system is employed in conjunction with a
general-purpose computer, it will not only be easy to use but in fact
will require substantial effort on the part of the programmer to avoid
using it. A software package, developed by B. Parasuraman (Schulz,
Parasuraman, 1971), will be employed in conjunction with the DIC to
accept the problem statement virtually in the form that differential
equations are normally written. Additional statements required are
the number of equations to be solved, a declaration of the dependent
and independent variables, and specification of the range and
precision of the desired solution. The software package will generate
the program, load the system, and store the solution output for
subsequent use or for printout and display.

The system employs serial-parallel processing which, although
slower than total parallel processing, does not allow the iteration
time to exceed the time required to process all integrals in one
module. The solution time of equations that do not require all available
modules can be decreased by distributing the various integral functions
over several modules. Serial-parallel processing also allows total
communication within the modules and restrictive communication between
modules without the necessity of resorting to patch boards or extensive
time or space multiplexing. Here, "total communication" means that any
integral output can be used as the dependent variable input, or as a
component thereof, to any or all integrals; this output also can be
used as the independent variable input to any two integrals.

Other innovations are the two-loop number system and simultaneous
integration and multiplication. It is shown that the two-loop number
system eliminates instabilities and oscillations encountered when
employing a circular number system. Increasing the size of the two loops
while maintaining the total number-range constant results in number
systems containing a hysteresis.

Simultaneous integration and multiplication can decrease the total
solution time by one half. With the given number system, it is possible
to make a partial prediction of the outcome of the integration at the
beginning of each integrating cycle. This prediction is sufficient to
allow initiation of the multiplication process of the output by either
a constant or a function and to complete the multiplication before the
integral output is generated.


A simplified method is developed that allows floating-point
arithmetic yet requires the storing and recalculating of only a single
exponent for each integral. Furthermore, this floating-point method does
permit simultaneous integration and post-multiplication.


D.  Approach 

Chapter II deals with the principles of numerical solutions and,
in particular, centers on solutions using digital differential
analyzers. This and the consideration of the basic DDA construction
parameters introduce the proper background for subsequent chapters.

Chapter III describes the concept of the proposed machine. The
design goal is outlined, and the necessary requirements to meet this
goal are established. In Chapter IV several number systems are
investigated. The two-loop number system is introduced, and an extension
of this system leads to a system of overlapping loops. The logical
implementation is presented for both the circular and two-loop number
systems, as well as for a multi-bit transfer two-loop system.

Chapter V considers the conceptual functional block. Several
innovations such as pre- and post-multiplication are incorporated into
the basic block, and simultaneous integration and post-multiplication are
introduced. It is shown that the outcome of the integral function in
single-bit transfer machines can be predicted. In addition, a
floating-point arithmetic method is introduced, which requires the
storage of only a single exponent for the total functional block.

The  multi-module  system  with  serial-parallel  processing  is  presented 
in  Chapter  VI.  Two  basic  approaches,  "horizontal  communication"  and 
"vertical  communication,"  are  considered  for  inter-module  communication. 

Chapter VII discusses the module; processing, intra-module
communication methods, and the memory organization are examined.
Chapter VIII outlines the proposed machine. Operating procedures,
communication methods within and between modules, and the externally
generated function inputs are explained. An example illustrates the
programming. Chapter IX summarizes the work and presents conclusions and
suggestions for further study.
Chapter II

PRINCIPLES  OF  DIGITAL  DIFFERENTIAL  ANALYZERS 

A.  Principles  of  Numerical  Solution 

To solve numerically for the integral of a function f(x), the
function is replaced by a formula that approximates f(x) over a small
interval of the independent variable x, and the result is integrated
over that interval. The equations employed usually require knowledge
of the previous values of the integral, the function, and some of its
derivatives. The newly calculated value of the integral then can be
used as a factor to compute other functions and to repeat the above
process. When the integral has been formed over the interval for which
the quadrature formula is valid, that formula must be updated by
obtaining new values for the function and its derivatives. These methods
are well described in the literature (Scarborough, 1966; Cunningham,
1958).

In  incremental  integration,  we  do  not  obtain  the  whole  integral 
but  only  the  change  of  the  integral  during  the  subinterval.  This  change 
then  is  transmitted  to  be  used  as  a  factor  to  calculate  other  functions, 
or  it  can  be  accumulated  to  yield  the  whole  integral  value. 

The most commonly used incremental integrating algorithms are
rectangular and modified trapezoidal.

1. Rectangular Integration
If, in the Taylor series

    f(x_{i+1}) = f(x_i) + f'(x_i)\,\Delta x_i + \frac{1}{2!} f''(x_i)\,\Delta x_i^2
                 + \frac{1}{3!} f'''(x_i)\,\Delta x_i^3 + \cdots    (2.1)

we drop all terms on the right-hand side that contain powers of $\Delta x$
greater than one, we have

    f(x_{i+1}) = f(x_i) + f'(x_i)\,\Delta x_i    (2.2)

where

    f'(x_i) = \frac{df(x_i)}{dx},    \Delta x_i = x_{i+1} - x_i


Equation (2.2) represents rectangular integration of $f'(x)$ with
respect to x. Figure 1 is the graphical representation of this process.
The area that is bounded by the curve y, the abscissa, and the ordinates
at the end points of the desired finite interval $(x_0, x_n)$ is divided
into small rectangles of height $y_i$ and width
$\Delta x_i = x_{i+1} - x_i$. If n is made to be very large, which is
equivalent to making $\Delta x$ very small, and if the function y is well
behaved, the sum of these rectangles is an approximation to the integral
of y over the specified interval:

    \lim_{n \to \infty} \sum_{i=0}^{n} Y_i\,\Delta X_i = \int_{x_0}^{x_n} y\,dx    (2.3)

If the integral $\int_{x_0}^{x_n} y\,dx = Z$, then the individual
$Y_i\,\Delta X_i$ are the increments $\Delta Z_i$ of the integral, and
$\Delta Z_i = Y_i\,\Delta X_i$.

Fig. 1. RECTANGULAR INTEGRATION, integral of y with respect to x.
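The limit in Eq. (2.3) can be illustrated numerically. The following is a
minimal Python sketch, not part of the original work; the function name
`rectangular_increments` and the test integrand y = x^2 are assumptions
chosen only for illustration. It forms the individual increments
Y_i * dX_i over n equal subintervals and accumulates them to approximate Z.

```python
def rectangular_increments(y, x0, xn, n):
    """Return the n increments dZ_i = Y_i * dX_i of rectangular integration."""
    dx = (xn - x0) / n  # equal subinterval width
    return [y(x0 + i * dx) * dx for i in range(n)]


# Hypothetical test integrand y = x^2 on [0, 1]; the exact integral is 1/3.
increments = rectangular_increments(lambda x: x * x, 0.0, 1.0, 1000)
z = sum(increments)  # accumulating the increments yields the whole integral
print(f"Z ~= {z:.6f}, exact = {1 / 3:.6f}")
```

Each list element plays the role of one increment of the integral; an
incremental machine transmits these increments instead of the running sum.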

The digital differential analyzer (DDA) is a device that implements
this incremental integration. Figure 2 illustrates the basic
construction of a DDA, which requires two registers and two arithmetic
units, and Fig. 3 is its schematic symbol. The inputs to the DDA are
the dependent and independent variable increments $\Delta Y$ and
$\Delta X$, respectively. In the following, all variables are normalized
to unity.


Fig. 2. DIGITAL INTEGRATOR FOR RECTANGULAR INTEGRATION.

Fig. 3. DIGITAL INTEGRATOR SYMBOL (inputs $\Delta X$ and $\Delta Y$;
output $\Delta Z = Y \cdot \Delta X$).


The value of $\Delta X$ is restricted to +1, -1, and 0. The accumulation
of the $\Delta Y$ increments is stored in the Y register and Y is added
to the content of the R register if $\Delta X$ is positive and subtracted
if it is negative. The process then is described by

    Y_i = \sum_{j=1}^{i} \Delta Y_j + Y_0    (2.4)

and

    R_i = \sum_{j=1}^{i} Y_j\,\Delta X_j + R_0    (2.5)

where $Y_0$ and $R_0$ are the initial values of the integrand and
integral, respectively. These equations can be rewritten as difference
equations:

    Y_i = Y_{i-1} + \Delta Y_i    (2.6)

and

    R_i = R_{i-1} + Y_i\,\Delta X_i    (2.7)

where $R_i$ is a whole word and is the summation of Y with respect to
X. If we consider the maximum allowable absolute value of R to be N,
then, whenever $|R_i|$ exceeds N, an overflow or underflow occurs which
represents the output $\Delta Z$ of the DDA. An accumulation of all
$\Delta Z$ in some other register will again be equal to the summation
of $Y_i\,\Delta X_i$ with the remainder $R_i$ in the R register:
Because the inherent delays in serial processing systems are
less than those in parallel systems, the modified trapezoidal
integration rule is not a proper algorithm for serial machines. The
Riemann integrating rule should be used (Monroe, 1962) for these systems.
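The register behavior described by Eqs. (2.6) and (2.7), together with
the overflow rule for the R register, can be sketched in a few lines of
Python. This is a hedged illustration rather than the author's
implementation: the class name `DDAIntegrator`, the overflow threshold of
2**n_bits, and the feedback example (the output dZ returned as dY, which
models dy/dx = y) are all assumptions.

```python
import math


class DDAIntegrator:
    """Rectangular DDA integrator exchanging ternary increments (+1, 0, -1)."""

    def __init__(self, n_bits, y0=0, r0=0):
        self.capacity = 1 << n_bits  # assumed overflow threshold of R
        self.y = y0                  # integrand register Y
        self.r = r0                  # remainder register R

    def step(self, dy, dx):
        """Process one iteration and return the output increment dZ."""
        self.y += dy                  # Eq. (2.6): Y_i = Y_{i-1} + dY_i
        self.r += self.y * dx         # Eq. (2.7): R_i = R_{i-1} + Y_i * dX_i
        if self.r >= self.capacity:   # overflow gives a positive increment
            self.r -= self.capacity
            return 1
        if self.r <= -self.capacity:  # underflow gives a negative increment
            self.r += self.capacity
            return -1
        return 0


# Feedback example: dZ is returned as dY, so the integrator models dy/dx = y.
n_bits, y0 = 10, 256
integrator = DDAIntegrator(n_bits, y0=y0)
dy = 0
for _ in range(1 << n_bits):  # 2**n_bits steps of dX = +1 span one unit of x
    dy = integrator.step(dy, 1)
approx_e = integrator.y / y0  # y(1) / y(0), to be compared with e
print(f"DDA estimate of e: {approx_e:.4f} (math.e = {math.e:.4f})")
```

The sketch keeps |Y| below the assumed threshold so that at most one
overflow can occur per step, which is what allows the output to be carried
by a single ternary increment.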


B. DDA Solution of Differential Equations

A single DDA solves the differential equation dz = ydx. To solve
more complex equations, several integrators must be interconnected such
that the resulting circuit models the equation. The basic programming
techniques are similar to those used for analog computers (Sizer, 1968;
Forbes, 1956) and are well known. The equation to be solved is
rewritten in differential form with the highest order derivative on the
left-hand side and all other terms on the right-hand side. Using the
highest order differential term as the dependent-variable increment
input, it is integrated with respect to the independent variable. With
successive integration, all derivatives of the function can be found,
and the terms on the right-hand side of the equation can be generated.
The sum of these terms is equal to the highest order differential and the
loop is closed. [Other approaches may result in a simpler program for
certain problems (Yu, 1968).] Some examples will be instructive and will
serve as a comparison to the program and hardware requirement for the
proposed system.


C. Examples

1. Example 1

We will consider the programming of Van der Pol's equation

    \frac{d^2 v}{dt^2} - \epsilon (1 - v^2) \frac{dv}{dt} + v = 0    (2.11)

with the initial conditions $v_0 = 1.5$ and $\dot{v}_0 = 0$. The maximum
value of $v$ and $\dot{v}$ will be less than 2 if $\epsilon$ is near
unity. Multiplying this equation by dt and moving all terms but the
highest order term to the right-hand side yields

    d\dot{v} = \epsilon\,dv - \epsilon v^2\,dv - v\,dt    (2.12)

where $dv = \dot{v}\,dt$. Letting $d\dot{v}$ be the dependent-variable
input to the first integrator and dt its independent-variable input, the
output becomes $\dot{v}\,dt = dv$. Integrating dv with respect to t
results in $v\,dt$ as the output of integrator 2. With the two terms dv
and $v\,dt$, we can now generate $d\dot{v}$ and close the loop as shown
in Fig. 4.


Fig.  4.  SOLUTION  DIAGRAM  FOR  VAN  DER  POL’S  EQUATION. 
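The closed loop of Eq. (2.12) can be traced in software. The sketch below
is a floating-point illustration under stated assumptions, not a model of
the integer registers or ternary increments of a real DDA: the function
name `van_der_pol_incremental`, the step dt = 0.001, the run length, and
the parameter value of 1 are all hypothetical choices. Each pass forms the
increment dv from the current derivative, then forms the increment of the
derivative from dv and v dt.

```python
def van_der_pol_incremental(eps=1.0, v0=1.5, vdot0=0.0, dt=1e-3, t_end=20.0):
    """Integrate Van der Pol's equation by accumulating increments."""
    v, vdot = v0, vdot0
    trace = [v]
    for _ in range(int(round(t_end / dt))):
        dv = vdot * dt                                # output of integrator 1
        v += dv                                       # accumulate v
        dvdot = eps * dv - eps * v * v * dv - v * dt  # Eq. (2.12)
        vdot += dvdot                                 # closes the loop
        trace.append(v)
    return trace


trace = van_der_pol_incremental()
peak = max(abs(v) for v in trace)
print(f"largest |v| over the run: {peak:.3f}")
```

The peak value gives a quick magnitude estimate of the kind that is needed
before the variables can be scaled.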

The program connection is now determined. To complete the program, we
must still scale all variables; several methods are possible. One
approach is to solve a set of algebraic equations for each integrator.
A single integrator is illustrated in Fig. 5, with the scaling quantities
M, N, X, Y, and Z. If the maximum value of the integrand is $y_m$, then
$2^M \ge y_m$. The number of bits used in the integrand register is N;
X, Y, and Z are the exponents of 2 such that there are $2^X$, $2^Y$, and
$2^Z$ increments for each unit of the independent-variable input,
dependent-variable input, and integral output, respectively. The
integrator is scaled correctly if

    Y + N = M    (2.13)

and

    X + M = Z    (2.14)

Fig. 5. DIGITAL INTEGRATOR SYMBOL WITH SCALING PARAMETERS.

All scale factors in the above example are shown in Table 1. In this
case, the maximum number of bits was taken to be 16. Note that the
independent-variable input for integrator 2 has a scale factor
different from that of integrator 1. This situation usually arises if the
original equation is magnitude and frequency scaled in a fashion similar
to equations being programmed for analog computers. Unless automatic
scaling is available, this method is very often the easiest and simplest
because it eliminates the need to solve the several sets of algebraic
equations (Peterson, 1968). If the machine does not allow the use of
different machine times, the X input for integrator 2 may be generated
by an additional integrator with a constant multiplication factor
and M = 2.

Table 1

SCALING FOR $\frac{d^2 v}{dt^2} - \epsilon (1 - v^2) \frac{dv}{dt} + v = 0$

Integrator   Function     y_m     M     N     X     Y     Z
(No.)        (y)
1            $\dot{v}$    2       1    14   -16   -13   -15
2            $v$          1.5     1    16   -14   -15   -13
3            $\epsilon$   1       2    16   -15    --   -13
4            $v$          1.5     1    16   -15   -15   -14
5            $v$          1.5     1    16   -14   -15   -13
6            -1                   0    16   -13    --   -13
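The scaling conditions (2.13) and (2.14) lend themselves to a mechanical
check. The sketch below is an illustration only: the function name
`is_scaled` and the tuple layout are assumptions, and the rows are
transcribed from Table 1, with `None` standing for an entry that the
table leaves empty or marks as not applicable.

```python
def is_scaled(y_max, M, N, X, Y, Z):
    """Check 2**M >= y_max and Eqs. (2.13) and (2.14) for one integrator."""
    if y_max is not None and 2 ** M < y_max:
        return False                  # integrand would exceed the register
    if Y is not None and Y + N != M:  # Eq. (2.13), skipped without a dY input
        return False
    return X + M == Z                 # Eq. (2.14)


# Rows transcribed from Table 1: (integrator, y_max, M, N, X, Y, Z).
TABLE_1 = [
    (1, 2, 1, 14, -16, -13, -15),
    (2, 1.5, 1, 16, -14, -15, -13),
    (3, 1, 2, 16, -15, None, -13),
    (4, 1.5, 1, 16, -15, -15, -14),
    (5, 1.5, 1, 16, -14, -15, -13),
    (6, None, 0, 16, -13, None, -13),
]
results = {row[0]: is_scaled(*row[1:]) for row in TABLE_1}
print(results)
```

A check of this kind is what an automatic scaling routine would apply to
every integrator after choosing M from the estimated maximum of the
integrand.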

2. Example 2

A  second  example  is  the  equation  of  a  circle: 
Table 2

SCALING FOR $y \frac{d^2 y}{dx^2} + \left(\frac{dy}{dx}\right)^2 + 1 = 0$

Integrator   Function     M     N     X     Y     Z
(No.)        (y)
1            $\dot{y}$    1    14   -15   -13   -14
2            $1/y$        0    14   -14   -14   -14
3            $1/y$        0    14   -14   -14   -14
4            $1/y$        2    16   -15   -14   -13
5            $\dot{y}$    1    14   -14   -13   -13
6            $y$          2    16   -15   -14   -13
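As a consistency check on the equation named in the title of Table 2, the
short derivation below shows that it is satisfied by any circle centered
on the x axis. The center coordinate a and the radius r are symbols
introduced here for illustration; they do not appear in the original
text.

```latex
\begin{align*}
(x - a)^2 + y^2 &= r^2 \\
(x - a) + y \frac{dy}{dx} &= 0
  && \text{(differentiate once, divide by 2)} \\
1 + \left(\frac{dy}{dx}\right)^2 + y \frac{d^2 y}{dx^2} &= 0
  && \text{(differentiate again)}
\end{align*}
```

Both constants drop out, so the second-order equation describes the whole
family of such circles and the particular circle is selected by the
initial conditions.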

D.  Construction  Parameters 

DDAs  can  be  classified  by  three  basic  construction  parameters 
(Wood,  1965): 

(1) parallel/serial input-output

(2) parallel/serial arithmetic

(3) parallel/serial processing of integrators

Although they do influence system performance, parameters (1) and (2)
chiefly represent possible trade-offs between solution speed and
hardware on a fixed-ratio basis. For example, if it takes $K_1$ clock
pulses to process an integrator with parallel arithmetic and it takes
$K_2$ clock pulses to process one integrator of the same bit length with
serial arithmetic, then for a given machine (serial or parallel) the
solution time using serial arithmetic will be $K_2/K_1$ multiplied by the
solution time using parallel arithmetic regardless of the equation being
solved. The third parameter, however, represents trade-offs of variable
ratios between solution speed and hardware complexity. Parallel
processing results in the highest possible solution speed, and the
iteration rate for a given word length is constant regardless of the
equation being solved. A machine that has a fixed number of integrators
can solve any problem as long as the availability of integrators is not
exhausted. For a practically sized machine, this might constitute a
severe limitation on the type of problems that can be solved.

Programs for parallel-processing machines must of necessity be
hard wired and the machine, therefore, cannot use stored programs.
The usual programming method in this case is either permanently wiring
for a machine that solves only one problem with different equation
parameters (or initial conditions) or plug-board programming as used on
analog computers.

Plug-board programming requires an excessive amount of time compared
to the solution time. Although this may be acceptable if the particular
program is a standard program to be used many times, this method is not
feasible in a situation where the machine is to be employed by many users
with different problems. In addition, wiring errors may easily occur,
especially in problems requiring a high wiring density.

The  solution  time  for  serial  processing  DDAs  varies  directly  with 
the  complexity  of  the  problem  because  only  one  integrator  is  processed 
at  any  one  time.  This,  however,  considerably  simplifies  the  problem  of 
programming.  Only  one  pair  of  input  variables  and  one  output  variable 
must  be  generated  or  transmitted  with  stored  programs.  Limitations  on 
size  or  complexity  of  problems  that  can  be  solved  on  a  particular  machine 
are,  in  this  case,  set  by  the  size  of  the  available  program  storage.  The 
machine  requires  only  a  single  processor. 

When programming any DDA, the programmer must be able to manipulate
the differential equation to be solved in such a way that he can set up
a solution diagram. This often requires recognition of functions as
solutions to differential equations which, in turn, are necessary for the
overall solution of the given problem. This requirement on the
programmer in itself restricts the practical use of DDA machines to a
relatively small number of users. One of the most important
considerations in this work, therefore, has been to develop a machine
environment that would eliminate the most difficult and time-consuming
programming tasks.


Chapter  III 


CONCEPT OF THE PROPOSED MACHINE


A.  Requirements 

The objective of this work was to develop a machine that would
satisfy the following goals. (1) It is to operate in the environment
of a computer center mated to either a small or large computer. (2)
It must be capable of operating with user-generated programs that
specify little more than the equation to be solved, the dependent and
independent variables, and the accuracy and range of the solution
desired. The mapping of the equation and the generation of the machine
program, therefore, must be accomplished automatically. (3) In addition
to acting as an external device to a general-purpose computer (GPC), it
must be able to operate independently in the environment of control
systems; this is important because many control systems are complex
enough to require the solution of differential equations but do not
warrant the expenditure of a large high-speed general-purpose computer.

The basic requirements selected were

(1) high accuracy
(2) high operating speed
(3) ease of programming
(4) solution repeatability
(5) solution reversibility
(6) modularity
(7) expandability
(8) adaptability

Their import in configuring the DIC is discussed in the following
sections.


B.  Accuracy  and  Solution  Speed 

With incremental computation, the largest errors encountered are
the result of quantization and truncation. Typically, the worst
possible error should not exceed $2^{-n}$ if n is the length of a whole
word in number of bits (Mayorov, 1964). In rectangular integration, the
error $\epsilon$ for a monotonic continuous curve is given by
$\epsilon < (y_n - y_0)\,\Delta x$ (Braun, 1963). The precision of the
computation varies directly with the length of the word and, unlike
general-purpose computers with bit-parallel, word-serial processing, the
solution speed is inversely proportional to the word length and
therefore to precision. Basically, precision can be increased without
limit by using longer words at the expense of more time, or solution
speed can be increased by sacrificing precision. Both alternatives are
attractive and their advantages can be selectively exploited, depending
on the application.
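The bound on the rectangular-integration error for a monotonic curve can
be checked numerically. The following sketch is illustrative only: the
function name `rectangular_error`, the exponential test curve, and the
choice of tying the step size to a power of two are assumptions, not
part of the original analysis.

```python
import math


def rectangular_error(y, exact, x0, xn, bits):
    """Return (error, bound) for rectangular integration with 2**bits steps."""
    n = 1 << bits
    dx = (xn - x0) / n
    approx = sum(y(x0 + i * dx) for i in range(n)) * dx
    error = abs(exact - approx)
    bound = abs(y(xn) - y(x0)) * dx  # (y_n - y_0) * dx for a monotonic curve
    return error, bound


# Hypothetical test curve y = exp(x) on [0, 1]; the exact integral is e - 1.
for bits in (4, 8, 12):
    error, bound = rectangular_error(math.exp, math.e - 1.0, 0.0, 1.0, bits)
    print(f"{bits:2d} bits: error = {error:.2e}, bound = {bound:.2e}")
```

Each additional bit halves the step and with it the bound, at the cost of
twice as many iterations, which is the precision-versus-time trade-off
described above.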

As noted in Chapter II, the accuracy of calculations can be improved
by using higher order integrating algorithms such as extrapolating
trapezoidal integration. Another method is to employ multi-bit increment
transfers to reduce the errors introduced by truncation of the integral
increment. Nilsen (1968) has shown that multi-bit increment transfers
permit the use of shorter word lengths and resultant savings in solution
time without the normally associated penalty of loss of accuracy. His
method, however, does introduce the restriction that integration can be
accomplished only with respect to time. Other problems associated with
multi-bit transfer are discussed in Chapter IV.

In  addition,  accuracy  can  be  improved  by  the  use  of  floating-point 
arithmetic.  Although  this  improvement  is  less  than  that  achieved  by 
multi-bit  transfer,  floating-point  arithmetic  does  have  the  additional 
benefit  of  considerably  reducing  and  possibly  eliminating  the  often  very 
difficult  problem  of  scaling. 

High  operating  speed  can  be  achieved  by  total  parallel  processing; 
the  result,  however,  would  be  incompatible  with  requirement  3.  Serial¬ 
processing  machines  operate  at  high  solution  speeds  for  simple  equations 
(equations  whose  solution  requires  only  a  few  integrations  per  iteration) 
but,  as  the  complexity  of  the  equation  increases,  the  solution  time  in¬ 
creases  proportionately  without  bound.  One  solution  to  this  dilemma  is 
a  machine  that  employs  serial-parallel  processing. 

C.  Ease  of  Programming 

The  usefulness  of  any  device  is  directly  related  to  the  ease  of  use 
by  the  programmer.  By  incorporating  provisions  that  allow  for  automatic 
editing  and  programming  wherever  possible,  the  proposed  system  can  be
employed  without  any  extra  effort  when  attached  to  a  GPC;  in  fact,  given 
a  DIC/GPC  system  with  a  built-in  translator  program  and  given  a  problem 
that  contains  differential  equations,  it  would  require  considerably  more 
effort  on  the  part  of  the  programmer  to  avoid  using  the  DIC. 

The automatic-editing capability is one of the key features necessary
for successful operation of the system in the computer-center
environment.


D.  Solution  Repeatability 

Repeatability  of  operations  or  calculations  is  necessary  to  ensure 
accuracy;  furthermore,  it  allows  for  some  fault  detection.  Because  the 
operating  parameters  and  the  initial  conditions  stored  in  the  DIC  are 
not  modified  or  destroyed  during  processing,  it  is  always  possible  to 
stop  at  any  point  and  repeat  the  solution  from  its  initial  value. 

In  control-system  applications,  a  given  equation  often  must  be 
solved  repetitively  with  only  a  few  changes  in  parameters,  and  only  these 
initial  conditions  or  parameters  must  be  entered  while  all  others  are 
retained.  A  similar  situation  occurs  when  searching  for  the  solution 
of  problems  with  given  initial  and  terminal  boundaries,  where  some  ini¬ 
tial  conditions  must  be  changed  until  the  proper  solution  is  found. 


E.  Solution  Reversibility 

The  DIC  is  capable  of  reversing  the  direction  of  computation;  there¬ 
fore,  we  can  stop  the  solution  at  some  point,  reverse,  and  retrace  it  to 
its initial value. Given the function value at some point in time
$t_j$, we can compute the solution by using a negative time derivative
and find the solution for the interval from $t_j$ to $t_i$ where
$i < j$.

It  should  be  noted,  however,  that  not  all  solutions  are  reversible. 
The  conditions  of  reversibility  for  the  solution  of  linear  difference 
equations  with  constant  coefficients  are  that  the  highest  and  lowest 
ordered  difference  terms  of  the  functions  must  have  coefficients  of 
unity  (Monroe,  1962). 



SEL-71-057 




F.  Modularity and Expandability

The  size  of  the  DIC  sets  a  limit  on  the  complexity  of  problems  that 
can  be  solved.  This  complexity  varies  with  the  order,  degree,  and  the 
number  of  equations.  The  question  then  is  how  small  the  machine  can  be 
without  severely  restricting  its  usefulness.  In  addition,  to  retain 
high  speeds,  we  wanted  to  avoid  continuously  increasing  solution  time 
with  increasing  complexity  of  the  equations  to  be  solved.  The  answer 
proved  to  be  a  modular  system  with  serial-parallel  processing.  Each 
small module is large enough only to solve a reasonable range of
problems; for more complex problems, it is only necessary to add additional
modules.  The  required  connection  between  modules  is  minimal  and,  if  not 
used  by  another  module,  each  module  can  operate  independently  on  differ¬ 
ent  problems.  The  result  is  a  modular  system  that  can  be  closely  matched 
to  the  needs  of  the  user. 


G.  Adaptability 

It  is  important  that  the  system  be  adaptable.  The  design  is  such 
that  with  a  proper  I/O  buffer  the  DIC  can  be  connected  to  almost  every 
existing  GPC  because  the  actual  operation  of  the  DIC  is  independent  from 
the  GPC;  the  general-purpose  computer  is  used  only  to  translate  the  equa¬ 
tion  to  be  solved  into  a  machine  program  and  as  an  I/O  device  for  the  DIC. 
In  this  configuration,  the  length  of  time  required  for  the  programming 
and  execution  of  problems  containing  differential  equations  can  be  re¬ 
duced  considerably.  The  efficiency  of  the  total  computing  system  is 
greatly  increased  because  the  DIC  can  solve  the  differential  equations 
much  faster  than  the  GPC  and,  as  a  result,  the  GPC  is  free  to  execute 
other  portions  of  the  program  simultaneously.  As  noted  above,  the  DIC 
can  operate  totally  independently,  which  is  particularly  useful  in  con¬ 
trol  systems  and  circuit  applications.  The  DIC  can  realize  filters,  ex¬ 
tract  Fourier  coefficients  from  some  signal,  or  monitor  and  control  pro¬ 
cesses  (Yu,  1967;  Raimondi,  1971).  In  these  applications,  the  program 
is  usually  used  repetitively  and  can  be  entered  or  changed  manually. 
Chapter  IV

NUMBER  REPRESENTATION 


A.  Binary  Number  System 

Of the many possible number systems, the binary number system using
2's complement arithmetic is the most logical choice not only for the
DIC operation but also to ensure compatibility with other computing
systems. Because the basic-unity increments represent the smallest
possible change of a word (a binary number), the value of the
increment is limited to 0, +1, or -1. If a single bit is used to
represent these increments, then a "1" represents +1, a "0" represents
-1, and an alternating string of "1" and "0" represents 0. Generally,
this method is called "binary communication of increments" and was
introduced with the design of the MADDIDA.

To avoid the problem of "zero oscillations," ternary representation
of increments can be employed. Here we use two bits, usually one
sign-bit and one magnitude-bit, allowing the representation of the
three desired states (0, +1, -1) and leaving one unused state (-0).


B.  Circular  Number  System 

Binary numbers and 2's complement arithmetic lead to a circular
number system (Braun, 1963; Mayorov, 1964). If we start with some
number and continuously add positive increments, it will eventually
reach the positive maximum; with the next positive increment, the
number will go to the minimum value. The reverse process occurs if we
have negative increments. For example, if the range of a number is
$-N$ to $N-1$, then when increasing we would have
$0, 1, 2, \ldots, (N-1), -N, -(N-1), \ldots, -2, -1, 0, \ldots$;
decreasing, we find the same series but in the reverse order.

This can be illustrated by considering a simple example of a binary
number register restricted to three bits. Starting from zero, we add a
single bit at a time to obtain a series of eight states, as shown in
Table 3. The respective decimal values are tabulated in the third
column. If we consider the highest order bit to be the sign-bit of a
2's complement representation, then the decimal values of the binary
numbers appear in the fourth column and we obtain
(0 1 1) + (0 0 1) = (1 0 0) which, in decimals, is (+3) + (+1) = (-4).
A state diagram for this table would show eight states connected in a
ring such that there is a path from every state to both of its nearest
neighbors. One can see that using 2's complement representation and
allowing overflows will result in a circular number system.

Table 3

BINARY REPRESENTATION OF NUMBERS

State    Binary     Decimal    Decimal Value for 2's
         Numbers    Value      Complement Representation

  a      0 0 0        0              0
  b      0 0 1        1              1
  c      0 1 0        2              2
  d      0 1 1        3              3
  e      1 0 0        4             -4
  f      1 0 1        5             -3
  g      1 1 0        6             -2
  h      1 1 1        7             -1
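
The wrap-around of a 3-bit register described above can be demonstrated in a few lines. This is an illustrative sketch only; the helper names are hypothetical and not taken from the report.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    mask = (1 << bits) - 1
    value &= mask
    if value & (1 << (bits - 1)):      # sign-bit set: negative number
        return value - (1 << bits)
    return value

def count_up(start, steps, bits=3):
    """Add one unit at a time and record the signed value of each state."""
    sequence = []
    register = start
    for _ in range(steps):
        sequence.append(to_signed(register, bits))
        register += 1
    return sequence

# Nine states starting from zero: the ninth is back at the start of the ring.
sequence = count_up(0, 9)
wrapped = to_signed(0b011 + 0b001, 3)   # (+3) + (+1) in a 3-bit register
```

Here `sequence` walks once around the ring of Table 3, and `wrapped` is the overflowed sum discussed in the text.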

Figure 7 is a graphical representation of the circular number system.
Increasing numbers move counter-clockwise on the circle; decreasing
(positive or negative) numbers move clockwise. An overflow occurs
whenever point S is crossed in the counter-clockwise direction and the
value of the number goes from $M-1$ to $-M$; an underflow occurs
whenever S is crossed in the clockwise direction and the number value
goes from $-M$ to $M-1$. The negative portion of the circle is larger
by one unit increment because, for some n-bit number, $M-1$
corresponds to $2^n - 1$ and $-M$ corresponds to $-2^n$.

Fig. 7.  CIRCULAR NUMBER SYSTEM.

This circular number system produces generally accurate results if
taken over a long period of time. It can result in masked
instabilities, however, if the increment of a number is zero averaged
over a period of time (but not instantaneously) and if the value of
the number is at or near either the positive or negative limit.


C.  Two-Loop Number System

The  remainder  of  this  chapter  deals  with  a  solution  to  this  diffi¬ 
culty.  Let  the  states  in  Table  3  be  rotated  such  that  the  column  begins 
with  state  e  and  ends  with  state  d.  We  break  the  ring  by  not  allow¬ 
ing  any  transition  to  go  from  e  to  d  or  from  d  to  e;  instead, 
let  the  (+1)  transition  from  d  go  to  a  and  the  (-1)  transition  from 
e  go  to  h.  The  result  represents  a  two-loop  number  system,  with  the 
two  loops  joined  by  transitions  between  states  a  and  h. 

If the range of a number is $-N$ to $N-1$, for example, and if the
number continuously increases starting with some negative value $-k$,
we obtain the series
$-k, (-k+1), \ldots, -1, 0, +1, +2, \ldots, (N-1), 0, +1, +2, \ldots$,
and so on. A continuously decreasing number starting with some
positive value $k$ results in a similar series:
$+k, k-1, \ldots, +2, +1, 0, -1, -2, \ldots, (-N+1), (-N), -1, -2, \ldots$.
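
The two series above can be generated with a small state-transition function. This is a sketch under stated assumptions, not the report's implementation: the function names are hypothetical, only unit increments are handled, and the range is taken as -N to N-1.

```python
def two_loop_step(value, increment, n):
    """Apply a unit increment in a two-loop system with range -n .. n-1."""
    if increment not in (1, -1):
        raise ValueError("only unit increments are handled in this sketch")
    nxt = value + increment
    if nxt > n - 1:        # overflow of the positive loop returns to 0
        return 0
    if nxt < -n:           # underflow of the negative loop returns to -1
        return -1
    return nxt

def trace(start, increment, steps, n):
    """Record the values visited while applying the same increment repeatedly."""
    values = [start]
    for _ in range(steps):
        values.append(two_loop_step(values[-1], increment, n))
    return values

rising = trace(-2, +1, 10, n=4)    # starts in the negative loop, ends in the positive loop
falling = trace(2, -1, 11, n=4)    # starts in the positive loop, ends in the negative loop
```

Once a number has entered one loop under increments of constant sign, it stays in that loop; the transition between loops happens only between 0 and -1.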

Using binary numbers with 2's complement representation, we have
one more negative state than positive states for any given number of
bits. If the word length is $n$ bits (plus sign-bit), therefore, the
maximum value is $(2^n - 1)$ and the minimum value (negative) is
$-2^n$. The loop interval, however, must be the same for the positive
and negative loops. As shown in Fig. 8, the return in the positive
loop is to zero and the return in the negative loop is to -1. This
results in a one-unit increment separation of the two loops but
ensures equal loop intervals. It is also possible to consider the 2's
complement representation of $(-2^n)$ to represent, instead, the
negative equivalent of zero (-0); in the negative loop, we would have
$(2^n - 1)$ steps going from -0 to $-(2^n - 1)$ instead of going from
-1 to $-2^n$.

Fig. 8.  TWO-LOOP NUMBER SYSTEM.

This two-loop number system eliminates any instabilities or
oscillations such as those that occur in the circular number system
because the return after an overflow or underflow is to a value other
than the minimum or maximum. Table 4 compares the behavior of the two
systems for a series of increments of the dependent variable which in
two places contains an "average zero derivative." Figures 9 and 10
illustrate the staircase approximations of this function, emphasizing
the difference between the circular and two-loop number systems. The
respective $f(n)_i = \sum_{j=1}^{i} \Delta Z_j$ are plotted. As can be
seen from both the table and Fig. 10, the "zero oscillations" of the
circular number system are not predictable.

Since the weight of the increment $\Delta Z$ is determined by the loop
length, it is clear that, given the same number of bits for both
systems, the weight of $\Delta Z$ in the two-loop system will be 1/2
the weight in the circular number system and $\Delta Z$ will occur on
the average at twice the rate of that in the circular system.
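
The comparison in Table 4 can be reproduced with a short simulation. The sketch below is illustrative and not taken from the report; the function names are hypothetical. It assumes a 3-bit register (range -4 to +3), a circular system that wraps modulo 8, a two-loop system with loop length 4, and the input sequence as read from the first column of Table 4.

```python
# Input increments as read from the first column of Table 4.
INPUTS = [0, 1, 2, 2, 2, 2, 2, 1, -1, 1, -1, 1, 3, 2, 1, -1, 1,
          -1, 1, 1, 1, 1, -2, -2, -2, -2, -1, -1, -1, 1, -1, 1, -1, -1]

def circular_step(r, d, bits=3):
    """Wrap around the full ring; an increment is emitted on crossing the sign boundary."""
    half = 1 << (bits - 1)              # 4 for a 3-bit register
    total = r + d
    if total > half - 1:
        return total - 2 * half, 1
    if total < -half:
        return total + 2 * half, -1
    return total, 0

def two_loop_step(r, d, bits=3):
    """Return to the start of the same loop; the loop length is half the ring."""
    m = 1 << (bits - 1)
    total = r + d
    if total >= m:
        return total - m, 1
    if total < -m:
        return total + m, -1
    return total, 0

def run(step, inputs):
    """Apply each input in turn and collect (remainder, increment) rows."""
    r, rows = 0, []
    for d in inputs:
        r, dz = step(r, d)
        rows.append((r, dz))
    return rows

circular_rows = run(circular_step, INPUTS)
two_loop_rows = run(two_loop_step, INPUTS)
```

Rows 9 to 12 of the table (inputs alternating between -1 and +1) show the difference: the circular register emits an increment on every step, while the two-loop register emits none.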


D.  Overlapping  Loops 

The two number systems described can be considered to be two
extremes. An interesting variation occurs if we extend the two loops
such that they overlap but are not identical, as shown in Fig. 11. As
an example, let the positive loop return to -3 and the negative loop
return to +2. Continuously positive increments would result in the
following series:

-n, -n+1, ..., -3, -2, -1, 0, +1, +2, +3, ..., n-1, -3, -2, -1, 0, +1, ...

and negative increments would generate

n-1, n-2, ..., +3, +2, +1, 0, -1, -2, ..., -n+1, -n, +2, +1, 0, -1, ...
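
The overlapping-loop example can be sketched the same way as the two-loop system, with the return points as parameters. The names and the default return points (-3 and +2, taken from the example in the text) are assumptions for illustration; the register range is taken as -n to n-1.

```python
def hysteresis_step(value, increment, n, pos_return=-3, neg_return=2):
    """Unit-increment step for overlapping loops on the range -n .. n-1."""
    nxt = value + increment
    if nxt > n - 1:        # overflow: return into the negative side of the overlap
        return pos_return
    if nxt < -n:           # underflow: return into the positive side of the overlap
        return neg_return
    return nxt

def trace(start, increment, steps, n):
    """Record the values visited while applying the same increment repeatedly."""
    values = [start]
    for _ in range(steps):
        values.append(hysteresis_step(values[-1], increment, n))
    return values

rising = trace(5, +1, 6, n=8)
falling = trace(-6, -1, 6, n=8)
```

Because the two return points differ, a small oscillation around the maximum or minimum does not produce an increment on every step.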

The  second  column  in  Table  5  tabulates  the  behavior  of  this  number 
system  for  the  same  series  of  increments  used  in  Table  4  and  can  be 
Table 4

GENERATION OF INCREMENTS FOR CIRCULAR AND TWO-LOOP NUMBER SYSTEMS

Arbitrary    Circular Number System (n)     Two-Loop Number System (n)
Input
(Ad)         Remainder   Increment (=ΔZ)    Remainder   Increment (=ΔZ)

   0          0 0 0           0              0 0 0           0
  +1          0 0 1           0              0 0 1           0
  +2          0 1 1           0              0 1 1           0
  +2          1 0 1          +1              0 0 1          +1
  +2          1 1 1           0              0 1 1           0
  +2          0 0 1           0              0 0 1          +1
  +2          0 1 1           0              0 1 1           0
  +1          1 0 0          +1              0 0 0          +1
  -1          0 1 1          -1              1 1 1           0
  +1          1 0 0          +1              0 0 0           0
  -1          0 1 1          -1              1 1 1           0
  +1          1 0 0          +1              0 0 0           0
  +3          1 1 1           0              0 1 1           0
  +2          0 0 1           0              0 0 1          +1
  +1          0 1 0           0              0 1 0           0
  -1          0 0 1           0              0 0 1           0
  +1          0 1 0           0              0 1 0           0
  -1          0 0 1           0              0 0 1           0
  +1          0 1 0           0              0 1 0           0
  +1          0 1 1           0              0 1 1           0
  +1          1 0 0          +1              0 0 0          +1
  +1          1 0 1           0              0 0 1           0
  -2          0 1 1          -1              1 1 1           0
  -2          0 0 1           0              1 0 1           0
  -2          1 1 1           0              1 1 1          -1
  -2          1 0 1           0              1 0 1           0
  -1          1 0 0           0              1 0 0           0
  -1          0 1 1          -1              1 1 1          -1
  -1          0 1 0           0              1 1 0           0
  +1          0 1 1           0              1 1 1           0
  -1          0 1 0           0              1 1 0           0
  +1          0 1 1           0              1 1 1           0
  -1          0 1 0           0              1 1 0           0
  -1          0 0 1           0              1 0 1           0

Fig. 11.  TWO-LOOP NUMBER


compared to the behaviors of those systems. Figure 12 illustrates the
staircase approximation for the same function, using the number system
with a slight hysteresis of one-bit. "Hysteresis" here means that the
loops are not separated by one or more bits and that the returns from
the maximum and minimum are to two different values. The third column
in Table 5 lists the same function, using a number system with a
two-bit hysteresis; its staircase approximation is shown in Fig. 13.
Although these systems do not eliminate all instabilities, they do
prevent oscillations in the case of very small function changes at or
near the maximum or minimum level. Input sequences that conceivably
could cause instabilities are not likely to be encountered.


Fig. 12.  INCREMENTAL INTEGRATION, USING A NUMBER SYSTEM WITH 1-BIT
HYSTERESIS.

Fig. 13.  INCREMENTAL INTEGRATION, USING A NUMBER SYSTEM WITH 2-BIT
HYSTERESIS.
E.  Logical Implementation


1.  Circular  Number  System 

The logical implementation of the circular number system requires
only 2's complement arithmetic and normal overflow detection. This
means that an increment is generated whenever there is a carry-bit
(or borrow when subtracting) into but not out of the most significant
bit (the sign-bit) or when there is a carry-bit out of but not into
the most significant bit. The sign of the increment is always equal to
the sign of the number before addition or subtraction. The
implementation of the two-loop system varies slightly.
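
The carry-based overflow rule can be checked exhaustively for a short word. The following is an illustrative sketch with hypothetical function names; it covers addition only (subtraction can be treated as addition of the negated operand) and compares the carry test with a direct range check.

```python
def to_signed(pattern, bits):
    """Interpret a bit pattern as a 2's complement number."""
    sign = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign else pattern

def circular_add(r, y, bits):
    """Add y to r in a `bits`-wide register; return (new value, increment)."""
    mask = (1 << bits) - 1
    low = (1 << (bits - 1)) - 1          # all bits below the sign-bit
    a, b = r & mask, y & mask
    carry_in = ((a & low) + (b & low)) >> (bits - 1)   # carry into the sign-bit
    carry_out = (a + b) >> bits                        # carry out of the sign-bit
    dz = 0
    if carry_in ^ carry_out:             # into but not out of, or out of but not into
        dz = 1 if r >= 0 else -1         # sign of the number before the addition
    return to_signed((a + b) & mask, bits), dz

def check_all(bits):
    """Compare the carry rule with a direct range test for every operand pair."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for r in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            _, dz = circular_add(r, y, bits)
            if (dz != 0) != (not lo <= r + y <= hi):
                return False
    return True

carry_rule_holds = check_all(4)
```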


2.  Two-Loop  Number  System 

Let $R$, $Y$, and $R^*$ be the sign-bits of the previous remainder,
the integrand, and the new remainder, respectively, using 2's
complement representation ("0" = positive, "1" = negative). Here, $C$
is the carry into the sign-bit when adding or the borrow when
subtracting; $S$ is the add/subtract control bit and is "1" for
addition and "0" for subtraction. We then want $R^*$ and $|\Delta Z|$
as a function of $R$, $Y$, and $S$. The function $F$ is necessary to
eliminate $C$.

From the map of $R^*$ and $\Delta Z$ (Fig. 14a), we can derive the
equation

$$R^* = RY'S' + RYS + RY'C'S + R'YC'S + R'Y'CS' + RYCS' \tag{4.1a}$$

or

$$R^* = \{[(R \oplus Y \oplus S)R]'\,[(R \oplus Y \oplus C)(R \oplus Y \oplus S)']'\}' \tag{4.1b}$$

If $F$ is the output of a full adder/subtracter on $R$, $Y$, $C$, and
$S$, we see from the map of $F$ (Fig. 14b) that

$$F = R^* \quad \text{if } |\Delta Z| = 0 \tag{4.2a}$$

and

$$F \neq R^* \quad \text{if } |\Delta Z| = 1 \tag{4.2b}$$

In the latter case $F = R'$.
Fig. 14(c):  $R \oplus Y \oplus S$

The function $R \oplus Y \oplus S$ (Fig. 14c) covers all the cases
where $F \neq R^*$; therefore, $R^*$ can be expressed as

$$R^* = F(R \oplus Y \oplus S)' + (R \oplus Y \oplus S)R \tag{4.3}$$

Similarly from the maps,

$$|\Delta Z| = R'Y'SC + R'YS'C' + RYSC' + RY'S'C = (R \oplus Y \oplus S)(Y \oplus C) \tag{4.4}$$

but

$$Y \oplus C = F \oplus R \tag{4.5}$$

because

$$F = R \oplus Y \oplus C \tag{4.6}$$

Therefore,

$$|\Delta Z| = (R \oplus Y \oplus S)(F \oplus R) \tag{4.7}$$

Again from the maps,

$$|\Delta Z| = (R \oplus Y \oplus S)(F \oplus R) = R^* \oplus F \tag{4.8}$$

This can be checked quickly by manipulation of $R^* \oplus F$:

$$\begin{aligned}
R^* \oplus F &= [(R \oplus Y \oplus S)R + (R \oplus Y \oplus S)'F] \oplus F \\
&= (R \oplus Y \oplus S)RF' + [(R \oplus Y \oplus S)R]'\,[(R \oplus Y \oplus S)'F]'\,F \\
&= (R \oplus Y \oplus S)RF' + (R \oplus Y \oplus S)R'F
\end{aligned} \tag{4.9a}$$

$$R^* \oplus F = (R \oplus Y \oplus S)(F \oplus R) \tag{4.9b}$$

The sign of $\Delta Z$ must always be equal to the sign of $R$. The
$\Delta Z$ generation of the two-loop system then is identical to that
of the circular number system, but the sign-bit of $R$ is inhibited
from changing whenever $|\Delta Z| = 1$.
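
The Boolean identities of Eqs. (4.1a) through (4.8) can be verified by enumerating all sixteen input combinations. This is an illustrative check, not part of the report; the function names are hypothetical, and the sum-of-products forms are written out as read from Eqs. (4.1a) and (4.4).

```python
from itertools import product

def inv(bit):
    """Logical complement of a single bit (0 or 1)."""
    return 1 - bit

def r_star_sum_of_products(R, Y, C, S):
    """Eq. (4.1a) written out term by term."""
    return ((R & inv(Y) & inv(S)) | (R & Y & S)
            | (R & inv(Y) & inv(C) & S) | (inv(R) & Y & inv(C) & S)
            | (inv(R) & inv(Y) & C & inv(S)) | (R & Y & C & inv(S)))

def dz_sum_of_products(R, Y, C, S):
    """Left-hand form of Eq. (4.4)."""
    return ((inv(R) & inv(Y) & S & C) | (inv(R) & Y & inv(S) & inv(C))
            | (R & Y & S & inv(C)) | (R & inv(Y) & inv(S) & C))

mismatches = []
for R, Y, C, S in product((0, 1), repeat=4):
    F = R ^ Y ^ C                             # Eq. (4.6)
    gate = R ^ Y ^ S                          # the function mapped in Fig. 14c
    r_star = (F & inv(gate)) | (gate & R)     # Eq. (4.3)
    dz = gate & (F ^ R)                       # Eq. (4.7)
    consistent = (r_star == r_star_sum_of_products(R, Y, C, S)
                  and dz == dz_sum_of_products(R, Y, C, S)
                  and dz == r_star ^ F        # Eq. (4.8)
                  and (dz == 0 or r_star == R))   # sign-bit inhibited when |dZ| = 1
    if not consistent:
        mismatches.append((R, Y, C, S))
```

An empty `mismatches` list means the factored forms agree with the sum-of-products forms for every combination of R, Y, C, and S.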


3.  Multi-Bit  Transfer  Two-Loop  Number  System 

The above equations were based on unity-increment transfers, which
means that $\Delta Z$ can only be 0, +1, or -1. To allow the increment
of $\Delta Z$ to take on values such that $(-n) < \Delta Z < (+n)$,
the $\Delta Z$ generation must be changed from detecting single-bit
overflows or underflows to multi-bit detection. Clearly, we could
duplicate the logic described for single-bit detection and use this
same logic for every $\Delta Z$ bit. For small $n$, this may not be
too costly but, as $n$ increases, this approach would become
uneconomical; furthermore, each detection stage introduces some
additional delay that must finally propagate up to the highest order
bit.

Proper multi-bit over- or underflow detection for the two-loop number
system can be accomplished by using the previously described logic
inserted between the most significant bit of $R$ and its sign-bit in
the same manner as in the unity-increment case. Let that portion of
$R$ which is to be read out as $\Delta Z$ be $\Delta Z^*$, as shown in
Fig. 15.


Fig. 15.  INCREMENT DETECTION FOR MULTI-BIT TRANSFER TWO-LOOP SYSTEM.


All the previous equations hold, with the exception of the equations
for $|\Delta Z|$, and their terms refer to the same variables as
before. Again,

$$R^* = \{[(R \oplus Y \oplus S)R]'\,[(R \oplus Y \oplus C)(R \oplus Y \oplus S)']'\}' \tag{4.10}$$

and

$$R^* \oplus F = (R \oplus Y \oplus S)(F \oplus R) \tag{4.11}$$

however, here $R^* \oplus F$ does not give the magnitude of
$\Delta Z$ but rather serves to indicate whether $|\Delta Z|$ is at
its maximum. If $\Delta Z_m$ is the most significant magnitude bit of
$\Delta Z$, then

$$\Delta Z_m = R \cdot (R^* \oplus F)' \tag{4.12a}$$

or

$$\Delta Z_m = R\,[(Y \oplus S) + F] \tag{4.12b}$$
Therefore, if $R^* \oplus F = 0$, we can determine $\Delta Z$ by
taking the sign-bit of $R$, copying this bit twice as the sign and
highest order bits of $\Delta Z$, and reading out the remaining
$\Delta Z^*$ bits. All $\Delta Z^*$ bits must then be reset to be
equal to the sign-bit of $\Delta Z$. For example, let the $R$ register
contain

    Sign    ΔZ*    R - ΔZ*
     0      010    10110...

and let $\Delta Z^*$ contain three bits; then, if the least
significant bit of $R$ is on the right, the left-most bit is the
sign-bit of $R$, the next three bits are $\Delta Z^*$, and the
remaining bits are $R - \Delta Z^*$. If $R^* \oplus F = 0$, then

    ΔZ = 0 0 0 1 0

and the new value for the $R$ register is

    0  000  10110...

For an example with negative $R$, let $R$ be

    1  101  10101...

If $R^* \oplus F = 0$, then

    ΔZ = 1 1 1 0 1

and the new $R$ is given by

    R = 1  111  10101...

If $R^* \oplus F = 1$, we have a carry (or borrow) into the sign-bit
which, however, is inhibited from changing. This means that $\Delta Z$
can now be determined by again taking the sign of $R$ as the sign of
$\Delta Z$, copying the inverse sign of $R$ into the most significant
bit of $\Delta Z$, and reading out the remaining $\Delta Z^*$ bits.
Again, all $\Delta Z^*$ bits should be reset to be equal to the
sign-bit of $\Delta Z$.

In the case where $R^* \oplus F = 1$, however, it should be noted
that all bits in the $\Delta Z^*$ register will always be equal to the
sign of $\Delta Z$ and therefore need no resetting because the
$\Delta Z^*$ register is always reset after readout, and the maximum
value that can be added to a $\Delta Z^*$ register of $n$ bits
(excluding the sign-bit) is $(2^n - 1)$ plus a carry (or borrow) from
the lower order bits of $R$. As a result, because $\Delta Z$ is a word
that is longer by one-bit than $\Delta Z^*$, the maximum value achieved
by  a  AZ  of  n  bits  (plus  sign-bit)  is  2n  1  and  the  minimum  value  is 
Chapter  V

THE  FUNCTIONAL  BLOCK 


Conceptually,  the  basic  DIC  module  is  made  up  of  a  number  of  iden¬ 
tical  functional  blocks,  each  containing 


(1)  memory locations for the integrand, integral, and the
dependent- and independent-variable increments

(2)  an  arithmetic  unit  (processor)  which,  when  given  the 
integrand  increments  and  independent-variable  incre¬ 
ments  as  inputs,  will  produce  the  integral  increments 
and  remainder 


Each block (Fig. 16) receives four inputs $\Delta V_i$, $\Delta X_i$,
$A$, and $B$ and generates as its output

$$\Delta W_i \approx A\,B\,V_i\,\Delta X_i \tag{5.1}$$

where $V_i$ is the dependent variable being integrated with respect to
$X$. The processor contains the functions of integrand-increment and
integral-increment multiplications by $A$ and $B$, respectively.

Fig. 16.  FUNCTIONAL BLOCK OF PROPOSED MACHINE.

A.  The  Integrating  Function 

Figure 17 is a flow diagram of the functional block. We shall first
consider the integration without multiplications by $A$ or $B$. The
flow diagram of this sub-block is that portion of Fig. 17 which is
enclosed by the dotted line. The inputs are $\Delta X$ and $\Delta Y$
and the output is $\Delta Z = Y\,\Delta X$. The equations describing
the operation of the sub-block are

$$Y_i = Y_0 + \sum_{j=1}^{i} \Delta Y_j \tag{5.2a}$$

or

$$Y_i = Y_{i-1} + \Delta Y_i \tag{5.2b}$$

and

$$R_i = R_{i-1} + Y_i\,\Delta X_i - M\,\Delta Z_i \tag{5.3}$$

$$\Delta Z_i = \frac{Y_i\,\Delta X_i + R_{i-1} - R_i}{M} \tag{5.4}$$

Explicitly,

$$\Delta Z_i = (R_{i-1} + Y_i\,\Delta X_i \geq M) - (R_{i-1} + Y_i\,\Delta X_i < -M) \tag{5.5}$$

and

$$Z_i = \sum_{j=1}^{i} \Delta Z_j \tag{5.6}$$

Substituting for $\Delta Z_j$,

$$Z_i = \frac{1}{M} \sum_{j=1}^{i} \left( Y_j\,\Delta X_j + R_{j-1} - R_j \right) \tag{5.7a}$$

or

$$Z_i = \frac{1}{M} \left( \sum_{j=1}^{i} Y_j\,\Delta X_j + R_0 - R_i \right) \tag{5.7b}$$

which is approximately

$$Z_i \approx \frac{1}{M} \sum_{j=1}^{i} Y_j\,\Delta X_j \tag{5.7c}$$

Fig. 17.  STRUCTURE OF FUNCTIONAL BLOCK.


In these equations, $M$ is the capacity of the integrand and remainder
words and is $2^n$, where $n$ is the number of bits not including the
sign-bit. Equations (5.3) to (5.5) implement the two-loop number
system. Equation (5.3) can be rewritten as

$$R_i = R_{i-1} + Y_i\,\Delta X_i - \mathrm{Sign}(R_{i-1})\,M \tag{5.8}$$

since the sign of $\Delta Z_i$ is always equal to the sign of
$R_{i-1}$ as shown in Eq. (5.5), and the magnitude of $\Delta Z_i$ is
equal to 1 or 0. The effect is that if $R$ is in the neighborhood of
but less than $M$ and the $Y_i\,\Delta X_i$ are positive, then $R$
will reach $(M-1)$ and with the next unit increment will go to zero
instead of to the most negative value $(-M)$. A similar but reverse
process occurs if $R$ and the $Y_i\,\Delta X_i$ are negative. In this
case $R$ will eventually reach $-M$ and with the next increment will
go to -1. Thus we have two separate loops, the positive going from 0
to $M-1$ and the negative going from -1 to $-M$.
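
The integrating function can be sketched as a short loop over the incoming increments. This is a minimal illustration, not the DIC logic itself: the function name is hypothetical, the integrand update and remainder update follow Eqs. (5.2b) and (5.3), the increment is taken as +1 when the new sum reaches M and -1 when it falls below -M, and the integrand magnitude is assumed to stay below M so that one increment per iteration suffices.

```python
def integrate_two_loop(y0, dys, dxs, m, r0=0):
    """Incremental rectangular integration with a two-loop remainder register."""
    y, r = y0, r0
    dzs = []
    for dy, dx in zip(dys, dxs):
        y += dy                        # Eq. (5.2b): update the integrand
        total = r + y * dx             # previous remainder plus Y * dX
        if total >= m:
            dz = 1                     # positive loop overflows
        elif total < -m:
            dz = -1                    # negative loop underflows
        else:
            dz = 0
        r = total - m * dz             # Eq. (5.3): new remainder
        dzs.append(dz)
    return dzs, y, r

# Constant integrand Y = 3 with M = 8, stepped forward and then backward.
forward, _, r_forward = integrate_two_loop(3, [0] * 16, [1] * 16, 8)
backward, _, r_backward = integrate_two_loop(3, [0] * 16, [-1] * 16, 8)
```

In both directions the emitted increments satisfy M times the sum of the increments equals the sum of Y dX plus the initial remainder minus the final remainder, which is the bookkeeping that Eq. (5.3) implies when summed over all iterations.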


B.  Constant  Multiplication 

Now let us consider the total functional block. Even linear
differential equations with constant coefficients require that some
terms be multiplied by constants. In addition, a method of problem
scaling relies on constant multiplication of the integrand and the
integral increments. The functional block therefore contains both of
these multiplications.

Multiplication by $A$ will be called "pre-multiplication" since it
occurs before integration; similarly, multiplication by $B$ will be
called "post-multiplication" because it is an operation on the
integral result. Pre-multiplication is limited to positive constants
($A$) which are positive integer powers of 2 ($A = 2^a$, $a$ =
positive integer); post-multiplication is limited to positive or
negative constants ($B$) whose absolute value is equal to or less
than 1 ($-1 \leq B \leq +1$). Using the multiplication factors $A$ and
$B$ simultaneously allows the integral to be multiplied by any desired
constant; for example, the multiplication factor of -3 is derived by
setting $A = 4$ and $B = -3/4$.
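
Splitting a desired constant into the two factors can be automated. The sketch below is illustrative only; the function name is hypothetical, and choosing the smallest admissible power of 2 for A is an assumption, since the text only requires A to be a positive integer power of 2 and the magnitude of B not to exceed 1.

```python
from fractions import Fraction

def split_constant(k):
    """Return (A, B) with A = 2**a (a >= 1), |B| <= 1, and A * B == k."""
    k = Fraction(k)
    a = 1                              # smallest positive integer exponent
    while Fraction(2) ** a < abs(k):   # grow A until B fits within [-1, +1]
        a += 1
    big_a = 2 ** a
    return big_a, k / big_a

pre, post = split_constant(-3)         # the example given in the text
```

Using exact fractions avoids rounding in B; in a machine the factor would be held to the precision of the register.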

The limitations set on $A$ and $B$ are dictated by the operating
principle of the DIC. Considering post-multiplication first, it is
clear that the integral increment is generated at some rate determined
by both the dependent and independent variables and that this rate
cannot exceed the maximum machine rate, which is equal to the maximum
number of iterations/sec. The highest possible rate occurs when the
independent-variable increment has a rate equal to the maximum machine
rate and when the absolute value of the corresponding dependent
variable is at a maximum. Under these conditions, the
integral-increment rate is approximately equal to the maximum machine
rate. Any rate multiplication, therefore, must be limited to factors
whose absolute values are equal to or less than 1.

Similar arguments apply to pre-multiplication. Given an integrand
word with maximum length of $n$ bits, the highest precision is
achieved by setting the unit increment equal to $1/(2^n - 1)$ of the
maximum possible integrand value. The unit increment is defined to be
the smallest allowable increment of a variable or function. If in some
calculation we desire maximum accuracy and if the dependent-variable
increment consists of a single-unit increment, then it is not possible
to multiply this increment by any factor less than 1. The restriction
of the factor $A$ to powers of 2 is a practical consideration and
simplifies the logical implementation. Since the pre-multiplying
factor appears mathematically outside the integral, it is not
necessary to include sign reversal if that is available in
post-multiplication.

If, in some calculation, precision is to be sacrificed for speed,
the maximum length of the integrand words may be scaled down by
uniformly pre-multiplying the integrand increments of all integrands
by the same factor. This new factor then is considered to be the unity
multiplication factor and any further multiplication necessary for the
function is superimposed. In this case, the factor $A$ may be less
than 1 ($A = 2^a$, $a = 0, \pm 1, \pm 2, \ldots$) but still positive.
C.  Implementation


The implementations of pre- and post-multiplication are quite
different from each other. In pre-multiplication, the value of the
incoming increment is multiplied by the multiplication factor and the
result is immediately added to the integrand (pre-multiplication is
the normal multiplication of two numbers). For post-multiplication,
this method is only possible and necessary for multi-bit transfer
machines. If we want the output to be a single magnitude bit which
limits it to the values of +1, -1, or 0, then some form of rate
multiplication is required. We could use a second identical functional
block, set the value of the integrand to be equal to the desired
post-multiplication factor, and use the output of the first block as
the independent-variable increment input. The dependent-variable
increment would remain zero. This, in fact, is the method normally
used in DDA machines.

In a parallel-processing machine the use of integrators for constant
multiplication becomes rather expensive, and the availability of
integrators is quickly exhausted. In a serial-processing machine this
method causes the solution speed to decrease considerably, since in this
case the time required for constant multiplication is equal to that
required for integration.

Incorporating  post-multiplication  into  the  basic  functional  block 
somewhat  reduces  the  extra  hardware  required  otherwise,  and  also  decreases 
the  total  time  spent  on  multiplication.  Both  time  and  hardware  can  be 
saved,  for  example,  by  eliminating  the  circuit  function  which  for  the  in¬ 
tegration  function  adds  the  dependent-variable  increment  to  the  integrand. 

Including the two multiplication functions in the functional block
results in the equations below describing its operation. In these
equations, M and M2 are the scale factors, R and R2 are the remainders
of the integrating function and the post-multiplication, respectively,
and $\Delta V$ and $\Delta W$ are problem variables: the dependent-variable
increment and the multiplied integral increment, respectively.
$$Y_i = A V_i = A V_{i-1} + A\,\Delta V_i \tag{5.10}$$

$$R_i = R_{i-1} + Y_i\,\Delta X_i - M\,\Delta Z_i \tag{5.11}$$

$${R2}_i = {R2}_{i-1} + B\,\Delta Z_i - M2\,\Delta W_i \tag{5.12}$$

$$\Delta Z_i = \frac{A V_i\,\Delta X_i + R_{i-1} - R_i}{M} \tag{5.13}$$

$$\Delta W_i = \frac{B\,\Delta Z_i + {R2}_{i-1} - {R2}_i}{M2} \tag{5.14}$$

This equation is implemented thus:

$$\Delta W_i = ({R2}_{i-1} + B\,\Delta Z_i \ge M2) - ({R2}_{i-1} + B\,\Delta Z_i < -M2) \tag{5.15}$$

The algorithm then yields

$$W_i = M\left[\sum_{j=1}^{i} \Delta W_j\,M2 + {R2}_i\right] + R_i \tag{5.16}$$

Substituting for $\Delta W_j$ we have

$$W_i = M\left[M2\sum_{j=1}^{i} \frac{B\,\Delta Z_j + {R2}_{j-1} - {R2}_j}{M2} + {R2}_i\right] + R_i \tag{5.17a}$$

$$W_i = M\left[\sum_{j=1}^{i} \left(B\,\Delta Z_j + {R2}_{j-1} - {R2}_j\right) + {R2}_i\right] + R_i \tag{5.17b}$$

or

$$W_i = M\left[\sum_{j=1}^{i} B\,\Delta Z_j + {R2}_0\right] + R_i \tag{5.17c}$$
Substituting for $\Delta Z_j$ results in

$$W_i = M\left[B\sum_{j=1}^{i} \frac{A V_j\,\Delta X_j + R_{j-1} - R_j}{M} + {R2}_0\right] + R_i \tag{5.18a}$$

or

$$W_i = B\sum_{j=1}^{i} A V_j\,\Delta X_j + R_0 - R_i + M\,{R2}_0 + R_i \tag{5.18b}$$

which finally can be written as

$$W_i = A B \sum_{j=1}^{i} V_j\,\Delta X_j + R_0 + M\,{R2}_0 \tag{5.19}$$
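The recurrences for the functional block can be exercised with a small integer model. The following is a minimal sketch, not the machine's implementation: the class name, the helper `overflow`, the test values, and the reading of the comparison in Eq. (5.15) as "+1 at or above +M2, -1 below -M2" are all assumptions made for illustration. It checks only the exact telescoping identities that follow from Eqs. (5.11) and (5.12).

```python
from dataclasses import dataclass
from typing import Tuple


def overflow(total: int, scale: int) -> int:
    """Assumed comparison rule: +1 at or above +scale, -1 below -scale."""
    return int(total >= scale) - int(total < -scale)


@dataclass
class FunctionalBlock:
    M: int        # scale factor of the integrating function
    M2: int       # scale factor of the post-multiplication
    A: int = 1    # pre-multiplication factor
    B: int = 0    # post-multiplication factor, |B| <= M2
    V: int = 0    # problem variable; the integrand is Y = A * V
    R: int = 0    # remainder of the integrating function
    R2: int = 0   # remainder of the post-multiplication

    def step(self, dV: int, dX: int) -> Tuple[int, int]:
        """Run one integrating cycle and return (dZ, dW)."""
        self.V += dV                              # integrand update, Y = A * V
        total = self.R + self.A * self.V * dX
        dZ = overflow(total, self.M)
        self.R = total - self.M * dZ              # remainder of the integration
        total2 = self.R2 + self.B * dZ
        dW = overflow(total2, self.M2)            # rate multiplication by B / M2
        self.R2 = total2 - self.M2 * dW           # remainder of the multiplication
        return dZ, dW


def run(steps: int = 1000):
    """Drive one block with dX = +1 and a slowly rising V; return running sums."""
    # Hypothetical values chosen so that |Y| stays below M for the default steps.
    blk = FunctionalBlock(M=256, M2=256, A=1, B=192, V=100)
    sum_dZ = sum_dW = sum_YdX = 0
    for i in range(steps):
        dV = 1 if i % 10 == 0 else 0
        dZ, dW = blk.step(dV, dX=1)
        sum_dZ += dZ
        sum_dW += dW
        sum_YdX += blk.A * blk.V
    return blk, sum_dZ, sum_dW, sum_YdX
```

With both remainders starting at zero, `M * sum_dZ` equals `sum_YdX - R` and `M2 * sum_dW + R2` equals `B * sum_dZ`, so the number of output pulses tracks B/M2 times the number of integral pulses to within one pulse.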

In the preceding sections all numbers were considered to be integers.
It is customary, however, to normalize all values such that the maximum
absolute value of the integrand may not exceed unity. In this case, M = 1
and the smallest possible increment of Y is $\Delta Y = 1/2^n$, where n is
again the number of bits excluding the sign-bit. It is clear, however,
that the largest positive and negative values which may be contained in
Y are $(1 - 1/2^n)$ and $(-1)$, respectively. It is therefore not possible
to multiply by +1 without some modification. A simple method to
allow multiplication by +1 is to include an additional bit in the
post-multiplier representing unity. This is not a problem in the case where
separate integrators are used for constant multiplication, because the
second integrator is then simply deleted.

Using  normalized  values,  the  equations  of  the  functional  block  are 

$$Y_i = A V_{i-1} + A\,\Delta V_i \tag{5.20}$$

$$R_i = R_{i-1} + Y_i\,\Delta X_i - \Delta Z_i \tag{5.21}$$

$${R2}_i = {R2}_{i-1} + B\,\Delta Z_i - \Delta W_i \tag{5.22}$$

$$Z_i = A \sum_{j=1}^{i} V_j\,\Delta X_j + R_0 \tag{5.23}$$

$$W_i = A B \sum_{j=1}^{i} V_j\,\Delta X_j + R_0 + {R2}_0 \tag{5.24}$$


D. Post-Multiplication by a Variable

Let us again consider the basic functional block with inputs $\Delta Y$, $\Delta X$
and output $\Delta Z = Y\,\Delta X$, and let us assume that the equation solution
requires the generation of $G\,dZ \approx G\,\Delta Z = \Delta W$. Taking a separate
integrator, we may obtain the increments $\Delta W$ by using $\Delta Z$ as the
independent-variable increment and $\Delta G$ as the dependent-variable increment,
thus giving $\Delta W = G\,\Delta Z = G\,Y\,\Delta X$. The same result may be achieved
by using a built-in post-multiplier; however, the required input would be G
and not $\Delta G$, which means that G must have been generated elsewhere in the
problem solution. In problems requiring function multiplications, the
capability of built-in post-multiplication by a function can save
considerable time and hardware. A few simple examples will demonstrate
potential time savings. The functional block symbol (Fig. 18) of the
proposed machine has one extra output, $A\,V\,\Delta X$, which will be discussed
later.


[Figure: functional block with outputs $A V \Delta X$ and $A B V \Delta X$]

Fig. 18. FUNCTIONAL BLOCK SYMBOL.


Example 1.

Inversion: Given dy, generate 1/y. Figure 19a is the conventional
DDA diagram, and Fig. 19b is the diagram using built-in
post-multiplication. The inverse is generated by solving




SEL-71-057 


$$d\!\left(\frac{1}{y}\right) = -\frac{1}{y^{2}}\,dy \tag{5.25}$$

[Figure: a. conventional DDA solution; b. solution using built-in
post-multiplication]

Fig. 19. GENERATION OF 1/y.


Example 2.

Generate the functions $\sin\omega x$ and $\cos\omega x$ by solving the
differential equation

$$d^{2}y = -\omega^{2} y\,(dx)^{2} \tag{5.26}$$

Then

$$d^{2}(\sin\omega x) = -\omega^{2} \sin\omega x\,(dx)^{2} \tag{5.27}$$

and

$$d(\sin\omega x) = \omega \cos\omega x\,dx \tag{5.28}$$

Figure 20a is the solution diagram in which both $\sin\omega x$ and $\cos\omega x$
are available as whole words and in incremental form. Figure 20b is an
alternate diagram which uses one less integrator but generates $\omega\cos\omega x$
in place of $\cos\omega x$. Figure 20c is the diagram for built-in
post-multiplication, where both the desired functions are available and the
circuit uses only two integrators.
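A two-integrator arrangement of this kind can be tried numerically. The sketch below is one possible wiring and is not taken from Fig. 20c: each block integrates its integrand against dx and post-multiplies the integral increment by $\omega$, and the two post-multiplied outputs are cross-coupled. The names `Block` and `generate_sin_cos`, the 12-bit word length, and the overflow rule are assumptions; $\omega$ must lie between 0 and 1 in this normalized form.

```python
import math


def overflow(total: int, scale: int) -> int:
    """Assumed ternary output: +1 at or above +scale, -1 below -scale, else 0."""
    return int(total >= scale) - int(total < -scale)


class Block:
    """Integrator with a built-in post-multiplier and single-bit outputs."""

    def __init__(self, Y: int, B: int, M: int):
        self.Y, self.B, self.M = Y, B, M
        self.R = 0    # remainder of the integration
        self.R2 = 0   # remainder of the post-multiplication

    def step(self, dX: int) -> int:
        total = self.R + self.Y * dX
        dZ = overflow(total, self.M)
        self.R = total - self.M * dZ
        total2 = self.R2 + self.B * dZ
        dW = overflow(total2, self.M)
        self.R2 = total2 - self.M * dW
        return dW     # post-multiplied increment, about (B / M) * dZ on average


def generate_sin_cos(omega: float, x_end: float, bits: int = 12):
    """Return approximations of (sin(omega*x_end), cos(omega*x_end))."""
    M = 1 << bits                        # one pulse is worth 1/M
    B = round(omega * M)                 # post-multiplier, 0 <= omega <= 1
    cos_blk = Block(Y=M - 1, B=B, M=M)   # integrand cos(wx), starts just below +1
    sin_blk = Block(Y=0, B=B, M=M)       # integrand sin(wx), starts at 0
    for _ in range(round(x_end * M)):    # each cycle advances x by dx = 1/M
        sin_blk.Y += cos_blk.step(1)     # d(sin wx) =  w cos wx dx
        cos_blk.Y -= sin_blk.step(1)     # d(cos wx) = -w sin wx dx
    return sin_blk.Y / M, cos_blk.Y / M
```

Calling `generate_sin_cos(0.5, 2.0)` integrates out to $\omega x = 1$; the results agree with `math.sin(1.0)` and `math.cos(1.0)` to within the quantization and truncation error expected of a 12-bit word.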
[Figure: a. conventional DDA solution; c. solution using built-in
post-multiplication]

Fig. 20. GENERATION OF $\sin\omega x$ AND $\cos\omega x$.


Example 3.

Given dx and the constant a, generate the increments $d(x^2 + a^2)^{1/2}$
and $d(x^2 + a^2)^{-1/2}$. Figure 21a is the DDA diagram, where
$d\ln(x^2 + a^2)$ is generated in addition to the desired outputs.
Figure 21b illustrates the use of built-in multipliers. An alternate
diagram (Fig. 21c) has $d\ln(x^2 + a^2)$ and $d(x^2 + a^2)^{-1/2}$ as
outputs. The square root is derived from

$$d\sqrt{y} = \frac{1}{2\sqrt{y}}\,dy \tag{5.29}$$

and the natural logarithm is from

$$d\ln y = \frac{1}{y}\,dy \tag{5.30}$$

The inverse is derived as in Example 1, from Eq. (5.25).

[Figure: b. solution using built-in post-multiplication; c. alternate
solution using built-in post-multiplication]

Fig. 21. GENERATION OF $d(x^2 + a^2)^{1/2}$, $d(x^2 + a^2)^{-1/2}$, AND
$d\ln(x^2 + a^2)$.
E. Simultaneous Integration and Post-Multiplication

As has been shown, some time saving may be gained by including
post-multiplication in the functional block. The greatest advantage
of built-in post-multiplication, however, is that for single-bit (plus
sign-bit) communication it is possible to perform post-multiplication
simultaneously with the integration. At the beginning of any integrating
cycle the following values are known: $\Delta Y_i$, $Y_{i-1}$, $R_{i-1}$,
$\Delta X_i$, and the post-multiplication factor B. Given these known
quantities, it is possible to make some prediction about the outcome of
the integrating cycle.

To proceed with the multiplication, we must know the sign and the
magnitude of

$$\Delta Z_i = Y_i\,\Delta X_i + R_{i-1} - R_i \tag{5.31}$$


Again using normalization to unity, we know that the magnitude of $\Delta Z_i$
can only be zero or unity. The sign of $\Delta Z_i$ may be determined as
follows. The absolute value of $Y_i\,\Delta X_i$ must be less than or equal
to M; the same is true for $R_i$ and $R_{i-1}$. If we assume that $R_{i-1}$
has some positive value M - C and if

$$|Y_i\,\Delta X_i| = M - D \tag{5.32}$$

then

$$R_i + \Delta Z_i = (M - C) + (M - D) \tag{5.33}$$

for positive $Y_i\,\Delta X_i$, where $0 \le C \le M$ and $0 \le D \le M$.
Therefore,

$$R_i + \Delta Z_i = 2M - (C + D) \ge 0 \tag{5.34}$$

which requires that $\Delta Z_i \ge 0$. For negative $Y_i\,\Delta X_i$ we have

$$R_i + \Delta Z_i = (M - C) - (M - D) = D - C \le M \tag{5.35}$$

and

$$D - C \ge -M$$

which requires that $\Delta Z_i = 0$. If $R_{i-1}$ is negative, the results are

$$\Delta Z_i \le 0 \quad \text{for negative } Y_i\,\Delta X_i \tag{5.36a}$$

$$\Delta Z_i = 0 \quad \text{for positive } Y_i\,\Delta X_i \tag{5.36b}$$

Hence, the outcome of the integrating cycle can be partially predicted
from the knowledge of $R_{i-1}$ only.

We may proceed, therefore, with the process of post-multiplication
simultaneously with the integrating cycle. The result of the
post-multiplication is then stored and used if $|\Delta Z_i| = 1$ and is
discarded otherwise. The additional time required for post-multiplication
is only one gate delay and becomes essentially insignificant in comparison
to the integrating cycle time. The process is valid for multiplication
by a constant as well as by a variable.
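The store-or-discard scheme can be checked exhaustively on a small word length. The following sketch compares a sequential reference (integrate, then post-multiply) against a version that post-multiplies on the sign predicted from $R_{i-1}$ alone. The function names, the 3-bit magnitude, and the overflow rule (+1 at or above +M, -1 below -M, so that remainders lie in the two's-complement range from -M to M - 1) are assumptions; true hardware concurrency is only imitated by computing the speculative result before $\Delta Z_i$ is examined.

```python
from itertools import product


def overflow(total: int, scale: int) -> int:
    """Assumed ternary output: +1 at or above +scale, -1 below -scale, else 0."""
    return int(total >= scale) - int(total < -scale)


def sequential(R, R2, Y, dX, B, M, M2):
    """Reference: finish the integrating cycle, then post-multiply."""
    total = R + Y * dX
    dZ = overflow(total, M)
    total2 = R2 + B * dZ
    dW = overflow(total2, M2)
    return total - M * dZ, total2 - M2 * dW, dZ, dW


def simultaneous(R, R2, Y, dX, B, M, M2):
    """Post-multiply on the predicted sign; keep the result only if |dZ| = 1."""
    # A non-negative remainder can only give dZ in {0, +1};
    # a negative remainder can only give dZ in {0, -1}.
    sign = 1 if R >= 0 else -1
    total2 = R2 + B * sign                 # speculative post-multiplication
    dW_spec = overflow(total2, M2)
    R2_spec = total2 - M2 * dW_spec
    total = R + Y * dX                     # integrating cycle
    dZ = overflow(total, M)
    R_new = total - M * dZ
    if dZ != 0:
        return R_new, R2_spec, dZ, dW_spec  # result is used
    return R_new, R2, 0, 0                  # result is discarded


def agree_everywhere(M: int = 8, M2: int = 8) -> bool:
    """Compare both versions over every representable state and input."""
    words = range(-M, M)
    for R, Y, dX, R2, B in product(words, words, (-1, 0, 1),
                                   range(-M2, M2), range(-M2, M2 + 1)):
        if sequential(R, R2, Y, dX, B, M, M2) != simultaneous(R, R2, Y, dX, B, M, M2):
            return False
    return True
```

`agree_everywhere()` walks through all combinations of remainder, integrand, increment, and multiplier for the chosen word length, so a `True` result means the prediction never disagreed with the sequential computation under the stated assumptions.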

If we now consider the linear differential equation

$$d\ddot{y} + a\,d\dot{y} + b\,dy + c\,y\,dt = F \cos\omega t\,dt \tag{5.37}$$

we can see that four integral outputs require multiplication by constants
but are also used without the multiplication as inputs to other blocks.

If the output of a functional block must be multiplied by a constant or
a function, that multiplication may occur within the block only if the
integral output (unmultiplied) is not used. Generally, if each functional
block has only one output and if each output is used as the
independent-variable increment input to a second block and not used
elsewhere, the two functional blocks may be combined into one, provided
that the dependent variable of the second functional block is already
being generated in some other block or that it is a constant. If we allow
the functional block to have two outputs (the integral and the
post-multiplied integral), as was shown in Fig. 18, then the only
restriction for simultaneous post-multiplication is that the multiplier
must be generated elsewhere or must be constant.


F. Extended Simultaneous Integration and Post-Multiplication

Theoretically, it is possible to extend the above method to allow
the multiplication of the integral increment by many functions
simultaneously. For example, the three integrators in Fig. 21a which
contain the dependent variable $y^{-1/2} = (x^2 + a^2)^{-1/2}$ could be
included in a single functional block; although all dependent variables
are identical except for one sign reversal, they could be different as
long as they are available or are constants.

Simultaneous post-multiplication is meaningful only in
serial-processing machines. Let us consider such a machine with an extended
multiplication capability which can handle p post-multipliers. Given a
differential equation and its solution diagram, all integrals are first
grouped into two categories.

(1) The independent-variable input to the integral is the
independent variable of the total equation, or the
dependent-variable input is a function not generated
elsewhere.

(2) The independent-variable input is the output of some
other integrator, and the dependent variable is constant
or is generated by one of the integrators in
the first category.

If there are n integrators in category 1 and m integrators in
category 2, the solution would require n + m machine cycles per iteration
for a traditional serial machine. The solution would require n cycles
in the machine with extended post-multiplication, provided that the
number of multipliers assigned to any one of the n integrators in category
1 does not exceed p.
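The cycle count for the two machines can be written down directly. This is a minimal sketch with hypothetical function names; the source states only the case in which no block is assigned more than p multipliers, so charging one extra cycle for each multiplier beyond p is an assumption added here for completeness.

```python
from typing import Sequence


def traditional_cycles(n: int, m: int) -> int:
    """Cycles per iteration on a traditional serial machine."""
    return n + m


def extended_cycles(assigned: Sequence[int], p: int) -> int:
    """Cycles per iteration with up to p built-in post-multipliers per block.

    assigned[k] is the number of category-2 integrals folded into the k-th
    category-1 integrator. Multipliers beyond p are assumed (not stated in
    the text) to need a machine cycle of their own.
    """
    n = len(assigned)
    leftover = sum(max(0, a - p) for a in assigned)
    return n + leftover
```

For four category-1 integrators with one multiplier each, `traditional_cycles(4, 4)` gives 8 and `extended_cycles([1, 1, 1, 1], p=1)` gives 4, which is the halving of the iteration time obtained with a single simultaneous multiplication.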

Although the additional time required for multiplication in the
extended unit is minimal, extra time is necessary for assigning and routing
the additional outputs of each unit; furthermore, if the multipliers are
variables, some time must be expended to fetch the variables. As pointed
out previously, although usually all integrals of category 1 require one
multiplication, the probability of encountering two or more function
multiplications is considerably smaller, and it would be unlikely that all
the variables would be generated elsewhere. Therefore, while a single
simultaneous multiplication can reduce the iteration time by one-half, it
is not economical to include the extended post-multiplication capability
(by two or more functions).


G.  Multi-Bit  Transfer 

Simultaneous multiplication as described is not possible if a system
of multi-bit increment communication is used because the magnitude
of the output cannot be predicted. Furthermore, it is not even possible
to predict the sign of the output. Again, the known quantities at the
beginning of the integral cycle are $R_{i-1}$, $Y_{i-1}$, $\Delta Y_i$, and
$\Delta X_i$. Here, however, the aligned (after scaling) maximum values of
R and Y are not equal. If $M_Y$ is the maximum value of Y and $M_R$ is the
maximum value of R, then

$$M_R \le M_Y \tag{5.38}$$

where $M_R = M_Y$ is the case of single-bit magnitude communication and
can here be ignored. If

$$R_{i-1} = M_R - C \tag{5.39}$$

and

$$|Y_i\,\Delta X_i| = M_Y - D \tag{5.40}$$

then, for positive $R_{i-1}$ and positive $Y_i\,\Delta X_i$,

$$R_i + \Delta Z_i = (M_R - C) + (M_Y - D) \tag{5.41a}$$

$$R_i + \Delta Z_i = M_R + M_Y - (C + D) \ge 0 \tag{5.41b}$$

because

$$0 \le C \le M_R, \qquad 0 \le D \le M_Y \tag{5.42}$$

As a result, $\Delta Z_i \ge 0$ but the magnitude of $\Delta Z_i$ is unknown.
For negative $Y_i\,\Delta X_i$,

$$R_i + \Delta Z_i = (M_R - C) - (M_Y - D) \tag{5.43a}$$

$$R_i + \Delta Z_i = M_R - M_Y - C + D \tag{5.43b}$$

But $M_Y > M_R$; therefore,

$$M_R - M_Y < 0 \tag{5.44}$$

D may be, but is not necessarily, greater than C. If C > D,

$$R_i + \Delta Z_i < 0 \quad \text{and} \quad \Delta Z_i \le 0 \tag{5.45}$$

If C < D, nothing at all can be said about the sign of $R_i + \Delta Z_i$; it
may be positive or negative. Similar results are obtained for negative
$R_{i-1}$. It is not possible, therefore, to have simultaneous
multiplication if multi-bit communication is used.
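A small enumeration illustrates why the sign of the remainder stops being a usable predictor once $M_R < M_Y$. The sketch below assumes a particular multi-bit output rule (the output counts whole units of $M_R$ in the accumulated sum, truncated toward zero); that rule, the function names, and the example word sizes are illustrative assumptions rather than the machine's definition.

```python
def multibit_step(R: int, Y: int, dX: int, M_R: int):
    """One cycle with a multi-bit output; returns (new remainder, dZ)."""
    total = R + Y * dX
    magnitude = abs(total) // M_R          # whole units of M_R, truncated toward zero
    dZ = magnitude if total >= 0 else -magnitude
    return total - dZ * M_R, dZ


def possible_signs(M_R: int, M_Y: int) -> set:
    """Signs of dZ reachable from a strictly positive remainder."""
    signs = set()
    for R in range(1, M_R):                # positive remainders only
        for Y in range(-M_Y, M_Y + 1):     # every aligned integrand value
            _, dZ = multibit_step(R, Y, 1, M_R)
            signs.add((dZ > 0) - (dZ < 0))
    return signs
```

With equal maxima, `possible_signs(8, 8)` contains only 0 and +1, matching the single-bit result. With `possible_signs(4, 16)` all three signs occur, so a positive remainder no longer rules out a negative output.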

Floating-point  arithmetic  combined  with  floating-point  single-bit 
magnitude  communication  can  be  handled  identically  to  fixed-point  single¬ 
bit  communication  with  respect  to  post-multiplication.  The  only  other 
requirement  is  the  addition  of  the  integral  output  exponent  to  the  expo¬ 
nent  of  the  multiplier.  Both  of  these  exponents  are  known  at  the  begin¬ 
ning  of  the  integral  cycle  since  the  exponent  of  AZ  must  always  be 
equal  to  that  of  the  integrand  Y. 
H. Floating-Point Arithmetic


DDAs conventionally use fixed-point arithmetic, with all quantities
scaled such that their absolute value does not exceed unity. One
exception is the BFPDDA (binary floating-point digital differential analyzer)
designed by J. L. Elshoff and P. T. Hulina (1970), which uses
floating-point unnormalized values throughout. This requires that at least
three exponents be used for each integral: one for the integrand Y, one
for the independent variable $\Delta X$, and one for the integral increment
output $\Delta Z$ and its remainder R. The advantage is that problems may
be entered without normalization; the disadvantage is that several
exponents must be recalculated during every iteration for each integral.
This is costly in time and hardware. The following proposed alternative,
while maintaining the most important features of floating-point
arithmetic (such as dynamic scaling and increased computing accuracies due
to reduced delays), uses a single exponent for all quantities in any one
integral by employing normalized values of the independent variable.

The system is best explained by beginning with a conventional
fixed-point DDA integrator. The absolute values of the integrand Y and the
remainder R are less than or equal to unity. The normalized
independent-variable input $\Delta X$ and the normalized output $\Delta Z$ are
±1 or 0, and the dependent-variable input $\Delta Y$ has a magnitude of less
than 1.

Let us now assume that R and Y are divided by 2, which is equivalent
to a right shift by one bit. To maintain the balance, the $\Delta Y$ inputs
must be divided by 2 and the value of $\Delta Z$ must be multiplied by 2.
Nothing, however, changes for the independent-variable input. If we use
exponents of 2 to indicate the number of right shifts, then the original
value of $Y = {*}\,y_1 y_2 y_3$ becomes $Y = {*}\,S\,y_1 y_2 y_3 \times 2^{1}$,
where S is the inserted bit, which is equal to the sign-bit of Y. Because
$\Delta Y$ and R are shifted simultaneously and by the same number of bits as
Y, their exponents are identical to the exponent of Y and may be deleted.
The output $\Delta Z$ is generated in the same way as before, and the required
multiplication can be achieved by merely appending the exponent of Y.
Thus only a single exponent storage and a single exponent calculation
are required for the basic integrating cycle.
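The balance between shifting the integrand and scaling the output can be illustrated with a constant integrand. This is a minimal sketch under stated assumptions: the function name, the 10-bit scale, and the overflow rule are hypothetical, only a positive integrand is exercised, and the shifted-out low bits are chosen to be zero so that no precision is lost.

```python
def integrate_constant(Y: int, M: int, steps: int, exponent: int = 0) -> int:
    """Integrate a constant integrand, returning the output in unshifted pulses.

    Y is right-shifted by `exponent` bits (Python's >> replicates the sign,
    like the inserted bit S), and every output pulse is weighted by
    2**exponent, which models appending the exponent of Y to the output.
    """
    Y_shifted = Y >> exponent
    R = 0                          # remainder, shifted along with Y
    z = 0
    for _ in range(steps):         # dX = +1 on every cycle
        R += Y_shifted
        if R >= M:                 # assumed overflow rule for a positive integrand
            R -= M
            z += 1 << exponent     # one pulse is now worth 2**exponent
    return z
```

For `M = 1024`, `Y = 512`, and 1000 cycles, exponents 0, 1, and 2 all return the same total: fewer pulses are emitted as the exponent grows, but each one carries proportionally more weight.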
It was previously assumed that the incoming $\Delta Y$ increment had
the same exponent as Y. This is true if we begin with a completely
scaled problem setup. Once any one of the integrands is shifted,
however, the assumption no longer holds. Because several integral
outputs may be summed to constitute a $\Delta Y$, it is necessary to
equalize all of their exponents to that of Y, which, of course, is
also true for the BFPDDA. If the inputs rather than the outputs are
stored, only one such output will arrive at any one time; therefore,
each increment of $\Delta Y$ may be scaled as it arrives and the stored
$\Delta Y$ will always have the proper exponent. Using the stored-input
method does, however, create one problem. Each output can be used as the
input to many other blocks and, therefore, is to be stored
simultaneously in the $\Delta Y$ storage of any or all functional blocks.
This requires that the exponents of all $\Delta Y$ (which are the exponents
of all integrands) be available at the end of each processing cycle.

This implies that, to save storage space, the exponent of each block
should be stored in its respective $\Delta Y$ storage rather than with
the integrand.

I. Floating-Point Post-Multiplication

A  problem  arises  when  the  floating-point  output  of  one  integral 
is  used  as  the  independent  variable  of  another.  If  the  exponent  asso¬ 
ciated  with  the  independent-variable  increment  is  greater  than  zero, 
multiple  integrating  cycles  are  required;  if  the  exponent  is  less  than 
zero,  fractional  cycles  are  necessary.  Most  systems,  including  the 
machine  here  proposed,  however,  allow  only  one  cycle  per  iteration. 

Let us first consider floating-point post-multiplication. Like all
other quantities in the system, the multiplier is a floating-point
number such that the absolute value of the mantissa is less than unity.
The integral increment, however, is a floating-point number with a
mantissa of absolute value unity or zero, where the sign of the mantissa
is predictable, as described for the fixed-point system. Multiplication
of two floating-point numbers is achieved by multiplying the
mantissas, adding the two exponents, and rescaling the result into standard
form. In our case, however, the result does not need to be rescaled
because the exponent is adjusted after transmission of the result to
the appropriate storage locations. Multiplication of the integral
output mantissa may therefore occur simultaneously with the integrating
cycle and, because both exponents are known at the beginning of the
integrating cycle, post-multiplication in a floating-point system may
be simultaneous.

As shown in Section D, whenever an integral output is used as the
independent-variable input, the resultant function can be considered as
a post-multiplication. In these cases, the integrating cycle is
identical to the one described above except that the output exponent is the
sum of the dependent- and independent-variable increment input exponents.
No storage is necessary for the independent-variable input exponent
because it is retransmitted for every iteration.
Chapter VI


THE MULTI-MODULE SYSTEM


A. Processing Methods

Assuming  that  we  have  a  differential  equation  requiring  m  separate 
integral  calculations  per  iteration,  we  can  visualize  the  equation  as  m 
points  in  a  two-dimensional  space,  each  representing  one  integral  func¬ 
tion.  To  solve  the  equation,  the  m  points  must  be  interconnected  ac¬ 
cording  to  the  solution  diagram. 

Now, any DDA machine can be represented pictorially as a matrix of
n points, each representing one integrator and each having two inputs
and one output. If we assume full parallel operation, where all n
points have their own processor, then, after each processing cycle, up
to n signals must be routed to a maximum of $n^2 + 2n$ destination points.
This is based on the assumption that each integral output can be used as
the independent variable of two other integrals and as a component of the
dependent-variable input of all integrals including itself. This is not
a practical scheme for any method other than patchboard programming. The
complete parallel machine, therefore, is not suitable under the conditions
described in Chapter III.

Let  us  now  examine  the  other  extreme  and  assume  that  a  single  proces¬ 
sor  is  available.  This  processor  then  operates  each  integral  in  some 
predetermined  order  and,  after  each  processing  cycle,  one  signal  must  be 
routed  to  as  many  as  n+2  points.  The  required  interconnection  becomes 
simple  enough  to  be  handled  automatically  with  a  reasonable  amount  of 
hardware.  As  n  becomes  large,  however,  the  iteration  and  solution 
times  increase  proportionally  to  n.  This  does  not  satisfy  the  condi¬ 
tion  of  high  solution  speed  discussed  in  Chapter  II  and,  therefore,  is 
not  suitable. 

Between these two extremes is the alternate method of serial-parallel
processing. If we consider the matrix of n points to be divided
into m smaller submatrices of $\ell$ points each and if we have m
processors (one for each $\ell$-point matrix), then we have serial processing
within each submatrix but we may process all submatrices in parallel with
each other. The iteration time would increase as the number of required
integrals increases but could not exceed $\ell$ integrating cycles. Within
each submatrix, one signal must be capable of being routed to as many as
$(\ell + 2)$ inputs after each integrating cycle. The serial-parallel
processing approach, therefore, simplifies the interconnection problem within
each module by limiting the number of signals to be routed at any one
time to one signal and by limiting the number of destinations to which
this signal is to be routed to $(\ell + 2)$. This approach, however, does
introduce the problem of interconnection between matrices.
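The routing load of the three processing methods can be tabulated from the counts given in the text. The sketch below uses hypothetical names (`RoutingLoad` and the three functions) and simply encodes those counts: signals routed after each processing cycle, the largest number of destinations, and the largest number of integrating cycles per iteration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingLoad:
    signals: int        # signals to route after each processing cycle
    destinations: int   # maximum number of destination points
    cycles: int         # maximum integrating cycles per iteration


def full_parallel(n: int) -> RoutingLoad:
    """One processor per integrator: n signals, up to n^2 + 2n destinations."""
    return RoutingLoad(signals=n, destinations=n * n + 2 * n, cycles=1)


def full_serial(n: int) -> RoutingLoad:
    """A single processor: one signal, up to n + 2 destinations, n cycles."""
    return RoutingLoad(signals=1, destinations=n + 2, cycles=n)


def serial_parallel(l: int) -> RoutingLoad:
    """Load inside one submatrix of l points when m submatrices run in parallel."""
    return RoutingLoad(signals=1, destinations=l + 2, cycles=l)
```

For n = 64 integrators split into 8 submatrices of 8 points, `full_parallel(64)` needs up to 4224 destinations, `full_serial(64)` takes 64 cycles per iteration, and `serial_parallel(8)` needs at most 10 destinations and 8 cycles within each submatrix; the connections between submatrices are not counted here.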


B.  Inter-Module  Communication  Methods 

Here again the practical solution is a compromise between the two
extremes of total interconnectibility, where every point of every matrix
can be connected to all other points of all other matrices, and no
interconnection between the matrices. The first becomes impractical
because of the very large expense in hardware and time. The second
extreme is unacceptable because it breaks the system into separate
machines that cannot communicate with each other and therefore restricts
the complexity of problems that can be solved to $\ell$; "complexity" refers
here to the number of integrals required for the equation solution, which
depends jointly on the order and degree of the equation. The alternative
is to introduce restrictions and limitations that will simplify the
hardware requirements and still allow a reasonable and sufficient level of
interconnections between matrices. Two possible basic approaches are
"vertical communication" and "horizontal communication."

1. Vertical-Communication Approach

For vertical communication, all matrices are stacked such that,
if all points in each matrix are labeled 1 through $\ell$, each point i
($1 \le i \le \ell$) in every matrix is directly below and above the points
i in the vertically adjacent matrices. Assuming total communication within
each plane, we now allow each point i to communicate with the points i
directly above and below. This communication scheme in itself is too
restrictive to be useful; however, by using communication channeling, any
point can communicate with any other point in any plane. "Communication
channeling" here means the duplication at some point in the plane of a
function being generated at another point such that the output can be
communicated to the required destination point.

Figure  22  is  an  example  of  communication  channeling  in  the 
vertical-communication  scheme.  For  point  A  to  communicate  with  point 
B,  we  require  the  duplication  of  A  at  A2.  It  is  important  to  note 
that  this  communication  channel  is  unidirectional.  Vertical  communica¬ 
tion  may  be  advantageous  for  some  problems  (such  as  weather-prediction 
calculations)  but,  in  most  cases,  will  lead  to  an  excessive  amount  of 
function  duplications  and  hardware. 


Fig. 22. CHANNELING IN VERTICAL COMMUNICATION.


2.  Horizontal-Communication  Approach 

For horizontal communication, all matrices are arranged in a
row. Each matrix contains two sets of points (set A and set B) such
that each point in set A (of matrix n) can communicate with every
point in set B contained in the matrix (n - 1), and all points in set
B of matrix n can communicate with every point in set A of matrix
(n + 1), as indicated in Fig. 23. The hardware requirements
for this inter-matrix communication scheme depend directly on the
number of points contained in sets A and B. As the size of the sets A
and B is reduced, however, there will be points within each matrix
which are not contained in either set and, therefore, cannot
communicate directly with points in other matrices. Communication channeling
is required in any case to allow all points to communicate with each
other unless sets A and B in each matrix are identical and all points
within the matrices are contained in the sets. Figure 24 is an example
of the use of channeling to permit communication between points C and
D in two adjacent matrices. Again there is a duplication at point C2
of the function generated at C.

Fig. 23. HORIZONTAL COMMUNICATION.


Fig.  24.  CHANNELING  IN  HORIZONTAL 
COMMUNICATION. 
Chapter VII


THE  MODULE 

A.  Processing  Methods 

The proposed machine will be a modular system. Each module is to
be complete and capable of operating independently or in conjunction
with other modules. Each must contain all the necessary memory for
program and data storage for some fixed number of different integrals,
as well as the required arithmetic capability. The module must be
automatically programmable, and it must be capable of repetitive
operation and solution reversal (if the function is reversible).

The  above  conditions  influence  the  selection  of  the  construction 
parameters.  The  two  most  important  of  these  are  the  method  of  process¬ 
ing  (serial  or  parallel)  and  the  method  of  intra-module  increment  com¬ 
munication. 

The methods of parallel and serial processing have been described
in the previous chapter in the context of interconnectibility. It was
shown to be advantageous to use parallel processing between the modules
but to use serial processing within each module. The principal
disadvantage of serial processing is that the iteration time increases as
the complexity of the problem increases; the principal disadvantage of
parallel processing is that the difficulty of integrator
interconnection prevents any practical automatic programming. The primary
advantages of serial processing within the module are economy of hardware
and interconnectibility. The serial-processing module requires only a
single arithmetic processor. The number of integrals that can be solved
within the module is limited only by the availability of memory space
for program and data storage. Because only one integral is processed
at a time, only a single set of inputs and outputs must be routed between
the integrating cycles. This, in contrast to parallel processing,
allows for completely automatic programming.
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B.  Intra -Module  Communication  Methods 

The  DDA  program  is  a  digital  model  of  the  system  under  study.  With 
few  exceptions,  therefore,  the  programs  are  closed-loop  systems  with  all 
functions  generated  internally,  and  all  inputs  to  the  integrators  are 
derived  from  the  outputs  of  other  integrators.  The  exceptions,  of  course, 
are  those  instances  when  the  machine  is  used  for  real-time  applications; 
in  these  cases,  one  or  more  externally  generated  functions  are  entered 
as  integrator  inputs.  To  communicate  the  various  outputs  to  the  desired 
inputs,  it  is  necessary  to  employ  at  least  temporary  storage  for  the  sig¬ 
nals.  We  may  choose  to  store  the  outputs  and  then,  at  the  beginning  of 
each  integrating  cycle,  to  fetch  the  various  required  outputs  which  may 
combine  to  form  the  dependent-variable  input  and,  unless  machine  time  is 
used,  to  fetch  the  output  that  is  used  as  the  independent-variable  input. 
Alternatively, we may choose to transmit each output as it becomes
available and continuously update and store all the input functions. These
two methods are called "output storage" and "input storage," respectively.

1.  Function Output Storage

Function output storage requires a minimum of memory. Each
functional  block  has  two  inputs;  one  is  the  dependent-variable  input  and 
is  generated  by  combining  several  outputs  and  therefore  requires  full- 
word  storage  in  contrast  to  the  single-bit  plus  sign-bit  that  is  neces¬ 
sary  for  the  storage  of  the  output.  At  the  beginning  of  each  integrating 
cycle,  assuming  that  all  outputs  are  stored,  one  output  is  fetched  to  be 
used  as  the  independent-variable  input  unless  the  machine  time  is  used. 
All  outputs  necessary  for  the  dependent-variable  input  are  then  fetched 
and  the  input  is  formed  in  some  arithmetic  unit.  Although  its  operation 
is  always  a  summing  of  the  various  outputs,  the  arithmetic  unit  must  con¬ 
sist  of  a  network  of  adders  such  that  all  outputs  can  be  summed  simulta¬ 
neously,  or  a  considerable  amount  of  time  must  be  allocated  to  generate 
this  input.  Assuming  simultaneous  summing,  all  the  required  outputs 
must be available simultaneously. This can be achieved by using a
selection matrix, but again this requires a considerable amount of
hardware. The program storage necessary for the selection codes is
n·(n + log n) bits, where n is the number of functional blocks in the
module.

2.  Function Input Storage

Function input storage requires memory space for two inputs
for each functional block; as stated above, one of these is a full word.
At the beginning of the integrating cycle, both inputs are immediately
available; at the conclusion of the cycle, the output is routed to the
appropriate independent-variable storage locations. The output also is
routed to be used to update all appropriate dependent-variable inputs.
One method is to employ a separate counter for each of these inputs; the
output then updates the counters by either adding or subtracting one unit
increment. The contents of the counters therefore are the dependent-variable
inputs and are always current. An alternative method would be
to employ a single adder and to fetch and update all selected inputs
sequentially. Although the first method requires considerably more
hardware, it is preferred because the second method requires too much time.
Using counters simplifies the selection network by allowing count-enable
lines to be directly tied to the proper section of the program word. In
this case, the program storage necessary for the selection codes is
n·(n + 2 log n) bits, where n again is the number of functional blocks
within the module.

Input  storage,  unlike  output  storage,  allows  easy  expansion  of 
the  module  to  a  larger  number  of  functional  blocks  with  the  increase  in 
hardware  being  directly  proportional  to  the  number  of  added  blocks. 
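
The two storage estimates can be compared numerically. The following is a minimal sketch, not part of the proposed design: it assumes that "log n" means the width in bits of one block address, ceil(log2 n), and the function name and the scheme labels are hypothetical.

```python
def selection_code_bits(n: int, scheme: str) -> int:
    """Program-storage bits needed for the selection codes of an n-block module.

    Assumption: "log n" is read as ceil(log2(n)), the width of one address.
    """
    addr_bits = (n - 1).bit_length()      # ceil(log2(n)) for n >= 1
    if scheme == "output":                # n select bits + 1 address per block
        return n * (n + addr_bits)
    if scheme == "input":                 # n select bits + 2 addresses per block
        return n * (n + 2 * addr_bits)
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, selection_code_bits(n, "output"), selection_code_bits(n, "input"))
```

Under this reading, a 16-block module needs 320 bits of selection code with output storage and 384 bits with input storage, so the more easily expandable scheme costs only 2 log n additional bits per block.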


C.  Memory Organization

If we assume that for each integral (and integral remainder word
stored) we have one corresponding integrand word, then, because each
cycle occurs in the same time interval during each iteration, it is
possible to organize the data-memory block as a push-down stack. This
considerably reduces the memory necessary for program storage by
eliminating the address portion from each program word. This trade-off,
however, may lead to duplication and multiple storage of the same
integrand word.

A similar situation exists for the storage of post-multiplication
factors, but we have one additional consideration. Post-multiplication
by a constant B, as described earlier, is unlikely to result in much
duplication if the push-down stack organization is used. If, however,
we maintain random-access organization for both the integrands and the
post-multiplication factors, it is possible to expand the
post-multiplication concept to include function multiplication.

If we assume a module size which allows the generation of m
integrals, each with a different dependent and independent variable, then
that module can solve a linear constant-coefficient differential
equation of order m. In the case where random-access organization is
maintained for all data, the coefficients may be varying functions provided
that they have already been generated in the course of the equation
solution. If any of the functions must be generated separately or if the
order of the equation exceeds m, two or more modules can be cascaded
for horizontal communication or they may be stacked for vertical
communication to provide the increased capacity necessary for the solution.

The number of integrals required to solve nonlinear differential
equations depends on the order as well as the degree of nonlinearity, here
jointly referred to as the complexity of the equation.


Chapter VIII


THE  PROPOSED  MACHINE 


A.  Operating  Procedure 

The system here proposed consists of one or more identical modules
which are self-contained and may operate jointly on the same problem or
individually on separate problems. Figure 25 is a block diagram of one
module.


Fig.  25.  BLOCK  DIAGRAM  OF  A  MODULE. 


The modules follow a cyclic operating procedure. During each
iteration, all integrals are processed sequentially for one integral
increment. The order of processing is determined by the program and is
repeated for each iteration. The processing of a single integral increment
is called a cycle and, in this system, a cycle consists essentially of the
following operations:


(1)  fetch the integrand increments

(2)  pre-multiplication

(3)  incrementation of the integrand

(4)  fetch the independent variable increment

(5)  generation of the integral increment and the new integral
     remainder

(6)  post-multiplication

(7)  storage of the integral increment as dependent- and
     independent-variable increments, as required


Figure  26  is  a  conceptual  flow  chart  of  the  module  operation.  The 
actual  time  sequence  of  operations  does  not  always  follow  this  pattern 
because  of  the  considerable  amount  of  parallelism  and  overlapping  of 
functions. Post-multiplication, for example, is initiated at the same
time as pre-multiplication and is completed before the integral
increment $\Delta Z$ is available. The result of post-multiplication then
is held until the $\Delta Z$ generation is completed, whereupon we either
store or discard the result depending on the magnitude of $\Delta Z$.


B.  Communication  within  the  Module 

The proposed machine will employ input variable storage. If
$\Delta W_{ik}$ is the $i$th increment output of the $k$th integral
function, then in terms of problem variables each integral increment
$\Delta W_{ik}$ can be used as the dependent-variable increment
$\Delta V_{j\ell}$, where $j = i + 1$ if $k \ge \ell$ and $j = i$ if
$k < \ell$, for any or all integrands in the module including the case
where $k = \ell$.


Fig. 26. FLOW CHART FOR MODULE OPERATION.


The increments $\Delta W_{ik}$ can also be used as the
independent-variable increment $\Delta X_{j\ell}$ for any two integrals
(again $j = i + 1$ if $k \ge \ell$ and $j = i$ if $k < \ell$). Each
dependent-variable increment $\Delta V_{j\ell}$ may be a single
$\Delta W$ or may be formed by the summation of any or all $\Delta W$.
Within each module, therefore, we allow total communication for the
dependent variable. The limitations on the independent variables are
not as severe as may appear. If some integral increment must be used
as the independent-variable increment of more than two integral
functions, one simply has to generate one integral twice. In addition, the
restriction can be relaxed or even be eliminated although this would
probably not be desirable. The dependent-variable increments are
updated for all integrals during every cycle, and the same is true for
the independent variable. Thus all inputs to any integral are ready
and immediately available at the initiation of any cycle.
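
The timing rule for increments exchanged inside a module can be stated as a one-line function. This is only an illustrative sketch: the function name is hypothetical, and the condition is read as "k greater than or equal to l" so that the explicitly included case k = l (an integral feeding its own integrand) is covered.

```python
def consuming_iteration(i: int, k: int, l: int) -> int:
    """Iteration in which cycle l uses the increment produced by cycle k
    during iteration i.

    Cycles are processed in ascending order within an iteration, so an
    increment produced by an earlier cycle (k < l) is still usable in the
    same iteration; otherwise it waits for iteration i + 1.
    """
    return i if k < l else i + 1


if __name__ == "__main__":
    print(consuming_iteration(5, 2, 4))   # earlier cycle feeds a later one
    print(consuming_iteration(5, 4, 2))   # later cycle feeds an earlier one
    print(consuming_iteration(5, 3, 3))   # an integral feeding itself
```

The design consequence is that feedback paths always carry a delay of at most one iteration, which is what allows every input to be ready at the start of each cycle.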

The  memory  organization  is  such  that  random  access  is  maintained 
for  both  the  integrands  and  post-multiplication  factors.  The  machine, 
therefore,  is  capable  of  simultaneous  integration  and  multiplication 
by either a constant or a variable.


C.  Communication between Modules

If  a  multi-module  system  is  required,  inter-module  communication 
is  achieved  by  the  "horizontal  communication"  method.  All  modules  are 
processed  in  parallel. 

The required hardware connections between cascaded modules consist
of two sets of increment-communication links between any two modules.
These links are unidirectional.

The communication between modules, while not as general and
complete as within each module, allows for any or all elements of the set
of the four integral increments $\Delta W_i$ ($i = m-3, m-2, m-1, m$) of
each module $n$ to be used as the dependent-variable increments for any
or all $\Delta V_i$ ($i = 1, 2, 3, 4$), or as components thereof, of each
module $n+1$. Also, any or all elements of the set of the four integral
increments $\Delta W_i$ ($i = 1, 2, 3, 4$) of each module $n$ can be used
as the dependent-variable increments for any or all $\Delta V_i$
($i = m-3, m-2, m-1, m$), or as components thereof, of each module $n-1$.

A reasonable and convenient size for m is 16, which means that a
single module is capable of solving equations up to the 16th order,
provided that the coefficients are already generated in the problem solution
or that they are constants. For many applications this should be more
than sufficient. If a greater capability is desired, one has only to
cascade additional modules. In Fig. 27, each integral is represented
as a point in a 4x4 matrix and each square plane of 16 points
represents a module. Any point enclosed by a solid line has total
communication with all other points within that enclosure (within the
module), and any point enclosed by a broken line has total communication
with all other points within the broken-line enclosure (external to the
module).


Fig. 27. MATRIX SHOWING COMMUNICATION BETWEEN MODULES.
(Planes shown: MODULE n+1, MODULE n, MODULE n-1.)


Each  enclosure  then  is  a  domain  of  total  communication  for  each 
point  within  it,  and  each  point  belongs  to  at  least  one  but  not  more 
than  two  domains.  Two  points  which  are  not  contained  in  the  same  do¬ 
main  cannot  communicate  directly.  Indirect  communication  is  achieved 
by  channeling.  (For  example,  points  A  and  D  in  Fig.  27  communicate 
via  points  B  and  C,  and  points  A  and  F  communicate  via  points 
B, C, and E.) Other communication schemes are possible and some
modifications of the above are being considered; however, any method must
by necessity use some kind of channeling to keep the basic organization
simple.

If  a  particular  integral  output  is  to  be  used  in  an  adjacent  module, 
the  output  increment  is  transmitted  to  that  module  immediately  after  gen¬ 
eration  and  is  held  there  temporarily  until  the  next  integrating  cycle 
has  been  initiated.  During  each  integrating  cycle,  a  time  slot  is  re¬ 
served  to  store  the  external  inputs  received  during  the  previous  cycle. 
Because  only  one  input  can  arrive  during  any  one  cycle,  the  routing  of 
that  signal  is  simplified  considerably;  furthermore,  the  interleaving 
of  internal  and  external  signal  storage  enables  us  to  store  the  address 
of the signal in the destination module rather than in the source module.
This saves both hardware and time because the address does not need to
be transmitted along with the signal between modules.


D.  External  Function  Input 

To  use  the  machine  for  real-time  control,  it  is  necessary  to  pro¬ 
vide  for  real-time  function  inputs.  The  communication  links  available 
to  cascade  modules  to  each  other,  if  not  required  to  connect  to  another 
module,  can  be  used  to  enter  external  functions.  A  module  therefore  is 
capable  of  accepting  up  to  eight  external  function  inputs.  As  modules 
are  cascaded  into  larger  systems,  only  the  modules  at  either  end  of  the 
string  can  accept  four  external  inputs  each  in  this  way.  These  external 
inputs  also  are  circuited  to  be  used  for  specific  integrals  (cycle  num¬ 
bers)  within  the  modules.  Additionally  required  external  functions  must 
use  separate  links  going  to  those  integrals  only,  which  in  Fig.  27  are 
represented  as  points  belonging  to  a  single  domain. 


E.  Iteration Time

Generally, as the complexity or the order of the equations increases,
the solution time increases proportionally, which is true in most
numerical-solution methods. The solution time per iteration required for the
DIC increases linearly with increasing problem complexity until it equals
16 cycle times (for m = 16); thereafter, the iteration time stays
constant as additional integrations are performed simultaneously in adjacent
modules. Thus the solution time per iteration for the DIC cannot exceed
the time required to process 16 integrations, regardless of the problem
size.

Although the order of processing, once established, remains the same
for every iteration, that order can be selected during program setup to
allow easy communication to other modules without much function
duplication or to reduce the iteration time by distributing one problem
over two or more modules. In examples 1 and 2 in Chapter II, a total of
six integral functions each are required if post-multiplication is not
used; therefore, the iteration time is equal to six cycle times. We could,
however, distribute the integral functions over the two modules n and
n + 1 such that each module contains three functions. The iteration time
in this case would be equal to three cycle times, or one-half of the time
required previously.
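
The bound on the iteration time can be illustrated with a short calculation. The sketch below is an assumption-laden illustration, not a description of the hardware: it assumes that the integrals are spread as evenly as possible over the modules and that all modules run in parallel, so the iteration time equals the largest number of integrals held by any one module. The function name is hypothetical.

```python
import math


def iteration_time_in_cycles(num_integrals: int, modules: int = 1, m: int = 16) -> int:
    """Cycle times per iteration when num_integrals are spread evenly
    over `modules` parallel modules, each holding at most m integrals."""
    if modules < 1 or num_integrals < 0:
        raise ValueError("need at least one module and a non-negative count")
    if num_integrals > modules * m:
        raise ValueError("problem does not fit; cascade more modules")
    return math.ceil(num_integrals / modules)


if __name__ == "__main__":
    print(iteration_time_in_cycles(6, modules=1))    # six integrals, one module
    print(iteration_time_in_cycles(6, modules=2))    # same problem, two modules
    print(iteration_time_in_cycles(40, modules=3))   # never more than m = 16
```

With six integrals the function gives six cycle times on one module and three on two modules, matching the example in the text; for any problem that fits, the result cannot exceed m.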


F.  The  Processor 

The processor of the proposed machine is described by the equations
in Chapter V, which were written for rectangular integration. However,
the processor contains one additional element which allows a choice of
several integrating algorithms. Equation (5.21),

$$R_i = R_{i-1} + Y_i\,\Delta X_i - \Delta Z_i$$

determines the algorithm. This equation can be rewritten in more
general form as

$$R_i = R_{i-1} + [\,Y_{i-1} + (K + 1)\,\Delta Y_i\,]\,\Delta X_i - \Delta Z_i \tag{8.1a}$$

$$R_i = R_{i-1} + Y_i\,\Delta X_i + K\,\Delta Y_i\,\Delta X_i - \Delta Z_i \tag{8.1b}$$

where  K  is  a  constant  which  determines  the  algorithm  being  used. 

Four  simple  and  very  easy-to-implement  algorithms  are  obtained  by 
selecting  K  to  be  0,  +1/2,  -1/2,  or  -1.  If  K  =  0,  we  have  the  pre¬ 
vious  case  of  rectangular  integration.  Setting  K  =  +1/2  results  in 
the  modified  trapezoidal  integration  rule  which  is  an  extrapolating  al¬ 
gorithm.  In  this  case, 


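$$R_i = R_{i-1} + Y_i\,\Delta X_i + \tfrac{1}{2}\,\Delta Y_i\,\Delta X_i - \Delta Z_i \tag{8.2a}$$

or

$$R_i = R_{i-1} + [\,Y_{i-1} + \tfrac{3}{2}\,\Delta Y_i\,]\,\Delta X_i - \Delta Z_i \tag{8.2b}$$

The interpolating trapezoidal algorithm requires that K = -1/2. Then,

$$R_i = R_{i-1} + Y_i\,\Delta X_i - \tfrac{1}{2}\,\Delta Y_i\,\Delta X_i - \Delta Z_i \tag{8.3}$$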


The fourth algorithm allows the proper multiplication of two variables
if K is set to -1 in one of the cycles and to 0 for the other. Summing
the two outputs and ignoring the remainders results in

$$\Delta(X_i Y_i) = Y_{i-1}\,\Delta X_i + X_i\,\Delta Y_i \tag{8.4a}$$

or

$$\Delta(X_i Y_i) = Y_i\,\Delta X_i + X_{i-1}\,\Delta Y_i \tag{8.4b}$$

Both  of  these  equations  represent  the  exact  product.  It  should  be  noted 
that  the  exact  product  also  can  be  achieved  by  using  the  interpolating 
trapezoidal  algorithm  (K  =  -1/2)  for  both  functions. 
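
That all three arrangements give the exact product can be checked with exact arithmetic. The sketch below is an illustration only; the function names and the scheme labels are hypothetical, and remainders are ignored as in the text. Each step returns the increment of X·Y computed from the old values and the two increments.

```python
from fractions import Fraction


def product_step(x, y, dx, dy, scheme):
    """Increment of x*y for one step, from old x, y and increments dx, dy."""
    if scheme == "8.4a":          # K = -1 for integrand Y, K = 0 for integrand X
        return y * dx + (x + dx) * dy
    if scheme == "8.4b":          # K = 0 for integrand Y, K = -1 for integrand X
        return (y + dy) * dx + x * dy
    if scheme == "trapezoidal":   # K = -1/2 in both cycles
        half = Fraction(1, 2)
        return (y + half * dy) * dx + (x + half * dx) * dy
    raise ValueError(f"unknown scheme: {scheme}")


def accumulate_product(x0, y0, steps, scheme):
    """Sum the product increments over a list of (dx, dy) steps."""
    x, y, total = x0, y0, 0
    for dx, dy in steps:
        total += product_step(x, y, dx, dy, scheme)
        x, y = x + dx, y + dy
    return total, x, y


if __name__ == "__main__":
    steps = [(1, 0), (1, 1), (0, -1), (-1, 1), (1, 1)]
    for scheme in ("8.4a", "8.4b", "trapezoidal"):
        total, x, y = accumulate_product(3, -2, steps, scheme)
        print(scheme, total == x * y - 3 * (-2))
```

In every case the accumulated increments equal the change in X·Y, because each expression expands to Y(i-1)·dX + X(i-1)·dY + dX·dY.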

To generate the new remainder R and the output $\Delta Z$, we must
first update the integrand Y. This time slot can be used to perform
the addition of $K\,\Delta Y_i\,\Delta X_i$ to the old remainder. Each
integrating cycle then has the following three phases.

Phase 1:  Pre-multiplication

$$\Delta Y_i = A\,\Delta V_i \tag{8.5}$$

Phase 2:  Update the integrand and modify the remainder,
          depending on the choice of algorithm

$$Y_i = Y_{i-1} + \Delta Y_i \tag{8.6}$$

$$R_i^{*} = R_{i-1} + K\,\Delta Y_i\,\Delta X_i \tag{8.7}$$

Phase 3:  Generate the integral output and new remainder,
          and simultaneous post-multiplication

$$R_i = R_i^{*} + Y_i\,\Delta X_i - \Delta Z_i \tag{8.8}$$

$$R2_i = R2_{i-1} + B\,\Delta Z_i - \Delta W_i \tag{8.9}$$


During the processing, only phase 1 and Eq. (8.6) are independent of
the value of $\Delta X$; therefore, whenever $\Delta X = 0$, Eq. (8.7) and
all of phase 3 do not affect the outcome of the calculation and may be
deleted. Normally, this case occurs quite frequently unless $\Delta X$ is
supplied by the machine time. Hence, to increase the solution speed
further, the machine will operate semi-asynchronously.

Although all operations occur in the same time slot during each
cycle, execution of Eq. (8.7) during phase 2 is prevented and the cycle
is terminated after this phase if $\Delta X = 0$, provided that the
equation to be solved is contained in a single module and that no
real-time external signals are used. In real-time problems and if the
problem is distributed over several modules, the processing operation must
be synchronous.
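
The three phases and the early exit for a zero independent-variable increment can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the machine's logic: the state layout, the function names, and the overflow rule (one output increment stands for M remainder units, emitted when the remainder magnitude exceeds M) are assumptions, and post-multiplication, Eq. (8.9), is left out.

```python
def integrating_cycle(state, dv, dx, A=1, K=0, M=128):
    """Run one integrating cycle; return the output increment dz in {-1, 0, +1}.

    state: dict holding the integrand 'y' and the remainder 'r'.
    """
    dy = A * dv                          # Phase 1, Eq. (8.5)
    state["y"] += dy                     # Phase 2, Eq. (8.6)
    if dx == 0:
        return 0                         # semi-asynchronous exit: skip (8.7), phase 3
    r_star = state["r"] + K * dy * dx    # Phase 2, Eq. (8.7)
    total = r_star + state["y"] * dx     # Phase 3, before removing the output
    dz = (total > M) - (total < -M)      # +1, 0 or -1 on remainder overflow
    state["r"] = total - dz * M          # Eq. (8.8), with dz worth M units
    return dz


def integrate_ramp(K, steps=100, M=128):
    """Integrate y = x for `steps` unit steps; return remainder + M * outputs."""
    state = {"y": 0, "r": 0}
    outputs = sum(integrating_cycle(state, dv=1, dx=1, K=K, M=M)
                  for _ in range(steps))
    return state["r"] + M * outputs


if __name__ == "__main__":
    for K in (0, 0.5, -0.5, -1):
        print(K, integrate_ramp(K))
```

Because the remainder carries everything not yet emitted, the returned quantity equals the full accumulated sum. For the ramp over 100 steps, K = -1/2 gives 5000, which is the exact integral 100²/2, while K = 0 gives 5050, illustrating the difference between the rectangular and interpolating trapezoidal rules.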


G.  Programming  and  Interface 

Since  the  system  is  completely  modular,  the  programming  is  identi¬ 
cal  for  each  module;  therefore,  only  one  module  will  be  considered  here. 
The  module  may  be  programmed  automatically  from  a  general-purpose  compu¬ 
ter  containing  the  appropriate  software-package  or  it  may  be  programmed 
manually.  In  either  case,  all  programming  steps  are  identical  and  must 
be  entered  in  the  same  order. 

For the simplified machine which allows simultaneous
post-multiplication by a constant but not by a variable (this eliminates the
indirect-addressing step prior to post-multiplication), the program contains
the following information:

(1)  number of integral cycles

(2)  integral address (cycle no.)

(3)  initial condition of the integrand (Y)

(4)  initial condition of first remainder (R)

(5)  pre-multiplication factor (A)

(6)  post-multiplication factor (+/-,PY)

(7)  initial condition of second remainder (PR)

(8)  positive or negative independent-variable input, either
     machine time or transmitted signal (+/-,T)

(9)  addresses of dependent-variable inputs to which output
     is to go (internal to module) (DYA)

(10) addresses of independent-variable inputs to which output
     is to go (internal) (DXA)

(11) is integral increment to be used for adjacent module
     (I/E)

(12) is integral increment used as output (O)

(13) addresses of dependent variable to which external input
     is to go (DYAE)

Steps  2  through  13  are  repeated  for  each  integral  cycle.  Entering  the 
options  of  solution  range  (number  of  iteration  cycles),  repetitive  oper¬ 
ation,  or  solution  reversal  completes  the  program. 
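
For readers who prefer to see the program word as a record, items (2) through (13) can be grouped as follows. This is a hypothetical sketch: the class name, field names, the reading of the paired entries as (sign bit, value), and the validation rules are assumptions made for illustration; only the list of items and the limit of two independent-variable destinations come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CycleProgram:
    """One per-cycle program entry (items 2 to 13 of the list above)."""
    cycle: int                      # (2) integral address
    y0: float                       # (3) initial integrand
    r0: float                       # (4) initial first remainder
    a: int                          # (5) pre-multiplication factor
    py: Tuple[int, float]           # (6) assumed (sign bit, factor)
    pr0: float                      # (7) initial second remainder
    t: Tuple[int, int]              # (8) assumed (sign bit, machine-time flag)
    dya: List[int] = field(default_factory=list)    # (9)
    dxa: List[int] = field(default_factory=list)    # (10)
    to_adjacent: int = 0            # (11) I/E
    is_output: int = 0              # (12) O
    dyae: List[int] = field(default_factory=list)   # (13)

    def validate(self, m: int = 16) -> None:
        """Check the addressing limits described in the text."""
        if len(self.dxa) > 2:
            raise ValueError("an output may drive at most two DX inputs")
        for addr in self.dya + self.dxa + self.dyae + [self.cycle]:
            if not 1 <= addr <= m:
                raise ValueError(f"address {addr} outside 1..{m}")


if __name__ == "__main__":
    entry = CycleProgram(cycle=1, y0=1.00, r0=0.50, a=4, py=(0, 1.00),
                         pr0=0.50, t=(0, 1), dya=[6], dxa=[2])
    entry.validate()
    print(entry)
```

Keeping the destination addresses with the source cycle mirrors the input-storage scheme, in which each output is routed to its destinations as soon as it is generated.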

All this information is entered in binary form and is automatically
cycled to the appropriate memory-storage locations. As an example, let
us consider Eq. (2.15) in Chapter II:

$$y\,\ddot{y} + \dot{y}^{2} + 1 = 0$$

or

$$d\dot{y} = -\dot{y}\,\frac{dy}{y} - \frac{dx}{y} \tag{8.10}$$
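
The step from the original equation to the incremental form of Eq. (8.10) uses only $dy = \dot{y}\,dx$ and $d\dot{y} = \ddot{y}\,dx$, with dots denoting differentiation with respect to x. Written out as a sketch:

```latex
\begin{align*}
y\,\ddot{y} + \dot{y}^{2} + 1 &= 0 \\
\ddot{y} &= -\frac{\dot{y}^{2}}{y} - \frac{1}{y} \\
d\dot{y} = \ddot{y}\,dx
  &= -\dot{y}\,\frac{\dot{y}\,dx}{y} - \frac{dx}{y}
   = -\dot{y}\,\frac{dy}{y} - \frac{dx}{y}
\end{align*}
```

As a check that is independent of the machine, note that $\frac{d}{dx}(y\,\dot{y}) = \dot{y}^{2} + y\,\ddot{y} = -1$, so every solution satisfies $y^{2} = -x^{2} + c_{1}x + c_{2}$ for constants $c_1$, $c_2$ fixed by the initial conditions; a computed solution can be compared against this closed form.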


The  solution  diagram  is  repeated  for  convenience  in  Fig.  28.  Assuming 
the  module  is  operating  in  conjunction  with  a  general-purpose  computer 
containing  the  translator,  the  following  input  statements  are  required 
to generate the DIC program:

The first two statements are the declarations of the independent and
dependent variables, respectively. Following the declarations is a
list of all coupled equations to be solved. The form of the equation
statement is very similar to that of Eq. (8.10), but the dots over the
variables are replaced by a numeral to signify the order of the
derivatives (dy becomes DY2). FIN indicates the end of the equation list.
If other equations (uncoupled from the previous set) are to be solved
simultaneously, the FIN statement is replaced by NEXT and is followed
by the new declarations. Except for the operating options, the complete
DIC program for Eq. (8.10) is shown in Table 6. Again the assumption is
that, at most, 16 bits are to be used and all initial conditions are
normalized.

Table 6

DIC PROGRAM FOR $d\dot{y} = -\dot{y}\,dy/y - dx/y$

Number of cycles = 6, and selection codes are always (0/1).

Cycle
(No.)   Y      R      A    +/-,PY   PR     +/-,T   DYA     DXA    I/E   O    DYAE

  1     1.00   0.50   4    0,1.00   0.50   0,1     6       2      0     0    -
  2     1.00   0.50   4    0,1.00   0.50   0,0     -       3,5    0     0    -
  3     1.00   0.50   4    1,1.00   0.50   0,0     2,3,4   -      0     0    -
  4     0.25   0.50   1    1,1.00   0.50   0,1     1,5     -      0     0    -
  5     1.00   0.50   4    1,1.00   0.50   0,0     1,5     -      0     0    -
  6     0.25   0.50   1    0,1.00   0.50   0,1     -       -      0     1    -

The  combined  GPC-DIC  system  requires  an  interface  to  channel  commu¬ 
nications  of  input,  output,  and  commands,  and  to  match  the  data  rates  of 
the  two  machines.  To  the  GPC  the  interface  will  appear  as  an  I/O  chan¬ 
nel  and  the  DIC  as  some  I/O  device.  Therefore,  once  the  translator  has 
set  up  the  program  and  it  has  been  transmitted  to  the  DIC,  the  GPC  is 
free  to  execute  other  programs.  An  additional  function  of  the  interface 
is  to  act  as  an  exit  port  for  the  immediate  printing  or  display  of  DIC 
problem solutions. Figure 29 is a block diagram of the combined system.


Fig. 29. BLOCK DIAGRAM OF GPC-DIC COMBINED SYSTEM.


Although  the  DIC  generates  data  at  an  extremely  high  rate,  it  is  not 
necessary  to  use  a  DMA  (direct  memory  access)  channel  since  typically 
the  solutions  are  read  out  only  every  hundred  or  even  thousand  itera¬ 
tions.  The  interface  will  contain  a  reasonably  large  temporary  storage 
(512  words)  to  allow  data  transfer  from  the  DIC  to  the  GPC  via  a  multi¬ 
bit  channel  under  program  control,  while  at  the  same  time  allowing  im¬ 
mediate  display  on  a  CRT  or  some  other  device. 


H.  Computer  Simulation 

The  basic  module  including  the  multiplication  functions  has  been 
simulated  on  the  Stanford  Computation  Center  IBM  System  360  model  67. 

The  program  was  written  in  FORTRAN,  and  Fig.  30  is  a  listing  of  the 
simulation.  The  input  data  are  in  essentially  the  same  order  and  form 
as  they  would  be  for  the  actual  hardware  machine,  but  the  initial  con¬ 
ditions  and  some  parameters  are  presented  in  decimal  form.  The  program 
execution  follows  exactly  the  steps  and  phases  (Section  F)  of  the  pro¬ 
posed  machine  with  the  exception  that  the  calculations  of  R2  (the  sec¬ 
ond  remainder)  and  the  final  output  increment  are  separated  to  save  com¬ 
puting  time.  The  simulation  uses  the  two-loop  number  system. 

After  program  read-in  and  verification,  all  variables  and  parameters 
are  initialized.  The  problem  solution  begins  with  the  first  iteration 


I  NTEGER  A,  B,  C,  OY,ABY,M,  N,  DX,  SR,  TR2,  ABR,  DZ1,  I  ,  K,  L,H,  ABPR,  CNT,  COUNT, 
Cl  TER  CYC  F 

INTEGER  Y(lb),R(i6),PY(16),PR(16),UD(16),UDA(16,16),  XC(  16,3),  PMAC  1 
CG,2),XA(16,2),DZY(lb) 

REAL  X,  CORRY, D,  NOPMX,  XMAX 
DIMENSION  YYC  16), MXC 16 ),  NORMYC  16) 

4000  FORMATC  '  1*  ,  1  NO  DATA  FOUND  FOR  INTEGRATOR  #  *  ,  I  3/ *  INITIAL  CONDITIO 
CNS  ARE  ASSIGNED  TO  INTEGRATORS  1  THROUGH  ',13///) 

4010  FORMATC  '  ,  3X,  I  6,  7X,  F7.  5,  5X,  6(  F7.  5,  3X) ) 

4020  FORMATC  ',//,'*W  A  R  N  I  N  G  *—',//,  'Y-VALUE  OF  INTEGRATOR  #',l 
C3, ' EXCEEOED  THE  MAXIMUM  VALUE  OF ', I o/ , 'OURI NG  ITERATION  #',I6/,'TH 
CE  VALUE  OF  Y  WAS  *,-',l6/,'Y  IS  RESET  TO  ' ,  I  6/  ,  6X,  '  —  *•*’  ) 

4030  FORMATC  ' ,  '  I  TERATI  ON'  ,  10X,  '  X'  ,  5X,  6  ( 4X,  '  YC  ,  I  2,  '  ) ' ,  IX) ) 

4040  FORMATC  '  ,  3X,  '  I  MTE- '  ,  2  X,  '  SUM-DY  INPUT  I  S'  ,  5X,  '  DZ-POST- ' ,  8X,  'DZ  GO 
CES  TO  DY',8X,'DZ  GOES  TO  DX'/,'  ',  2X, '  GRATOR'  ,  3X,  •  SHI  FTED  UP  BY', 3 
CX, 'MULTI  PLI  CATION', SX, 'OF  I  NTEGRATORS' ,  7X,  '  OF  INTEGRATORS',/) 

4050  FORMATC  '  ,  3X,  I  3,  8 X,  1 4,  10 X,  FI 0 . 7, 6 X,  1G<  I  2,  '  ,  '  ) ) 

40b0  FORMATC  '♦',68X,I3,  ',',13) 

COM  READ  IN  PROGRAM,  INITIAL  CONDITIONS  AND  PARAMETERS 
REAOC  5,  * )  I  TER,  XMAX,  NORMX,  CYC, M,  II,  COUNT 
CNT-COUNT-1 
DO  10  1-1,16 
UD( I ) -0 

10  READC  5,*,  END-2  0,  ERR-20)  MX(  I  ),  Y(  I  ),  R(  I  ),  PY(  I  ),  PRC  I  ),  (UDAC I  ,  J) ,  J-l, 
C1C ),  (  PM  AC  I  ,K),K— 1,2),  (XA(  I  ,K),K— 1,2),  (XCC  I,  L),  L— 1,  3) 

20  WRITEC6, 4000)1, 1-1 
WRI TEC  6, 4040) 

COM  CLEAR  AND  INITIALIZE  DY-STORAGE 

COM  PRINT  CONNECTION  TABLE  AND  PRE-  AND  POST-MULTIPLICATION 
COM  FACTORS  FOR  PROGRAM  VERIFICATION. 

DO  40  1-1, CYC 
NORMYC  I  )-M/MXC  I  ) 

K-l 

DO  30  J-l, 16 

I FCUDAC I , J) . NE. 1 )  GO  TO  30 

DZY  ( K)  -J 

K*K*1 

30  DZYC  K) *0 

WRI  TEC  6,  4050)  I,  PMAC  I  ,  1 ) ,  PY  C  I  )/(N*0.  ),  C  DZYC  J ),  J-l,  K) 

40  WRITEC6,4060)CXAC I, J), J-l, 2) 

WRITEC6,4030)CI,I-1,CYC) 

COM  START  ITERATIONS 
DO  120  H-l, ITER 

COM  NORMALIZE  X  AND  Y  FOR  OUTPUT 
DO  50  1-1, CYC 
50  YYC I ) -YC I  )/ NORMYC I  ) 

X-CH-D/NORMX 
IFCX.GT.XMAX)  GO  TO  120 
COM  OUTPUT  Y  IF  CALLED  FOR 
COM  DELETING  THE  FOLLOWING  STATEMENT  CAUSES 
COM  ALL  Y'S  (1-6) TO  BE  PRINTED  :  I F( PMAC I , 2 ) . EQ. 0 )  GO  TO  60 
CNT- CNT* 1 

IFCCNT.  LT. COUNT)  GO  TO  60 
WRITE(6,4010)H,X,  (YYC  I  ),  1-1,  CYC) 

CNT-0 

60  CONTINUE 
COM  START  CYCLE 

DO  110  1-1, CYC 

COM  FETCH  DY,  PRE-MULTI PLY,  CLEAR  DY-STORAGE 
DZ-0 
DZ1-0 

DY-UD' I ) -2  *  *  PMA (1,1) 

U0(l)-0 

COM  UPDATE  Y  AND  CHECK  FOR  Y-OVERFLOW 
YC I )»Y( I )*DY 
ABY-IABSCYC I )) 

IFCABY.LE.M)  GO  TO  70 
YC I ) -I  SI GNC ABY-M, YC I ) ) 

WRITEC6, 4020)  1, M,H,ABY,Y(I  ) 

70  CONTINUE 


Fig.  30.  FORTRAN  SIMULATION  PROGRAM  OF  DIC  MODULE. 
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COM  DETERMINE  SOURCE  OF  DX 
OX-XC(l,2) 

IF(DX.EQ.O)  OXBXC( 1,1) 

XC(l,l)-0 

A-l 

COM  PREOICT  INTEGRL.  INCREMENT  OUTCOME 

COM  AND  00  POSY-MULTIPLICATION,  STORE  POST-MULT.  OUTPUT  IN 
COM  TEMPORARY  STORAGE 
SR«I S I  GN(  A,  R(  I  ) ) 

TR2-PR<  1  )*SR*PYC  1 ) 

COM  CHECK  VALUE  OF  OX,  TERMINATE  CYCLE  IF  OX»0 
IFJOX.EQ.Q)  GO  TO  110 

COM  COMPUTE  REMAIMOEP  1  (R)  AND  INTGRL.  OUTPUT  0Z1 

COM  DISCARO  REMAINDEP.  2  AND  POST-MULTI  PL.  OUTPUT  IF  0Z1-0 

R(  I  )bR(  I  )♦  Y(  I  )*DX 
ABR» I AB3(R( I ) ) 

IF(ABR.LE.M)  GO  TO  80 
R( I )■! SI GN( ABR-M, R(  I  )) 

DZ1«SR 

COM  STORE  REMAINDER  2,  ASSIGN  FINAL  OUTPUT  INCREMENT 
PR<I)«TR2 
ABPR* I ABS( TR2 ) 

IF(ABPR.LE.N)  GO  TO  80 
PR(  I  )■!  SI  GN(  A6PR-N,  TR2) 

DZ»I SI GN( A, PR( I  )) 

80  CONTINUE 

COM  ROUTE  OUTPUT  OZ  TO  DESIRED  DY  INPUT  STORAGE  LOCATIONS 
DO  90  K-l.CYC 
C-UOA<l,K) 

IF(C.EO.O)  GO  TO  90 
UD(K)-UD(K)*OZ 
90  CONTINUE 

COM  ROUTE  OUTPUT  OZ  TO  OESIREO  DX  INPUT  STORAGE  LOCATIONS 
DO  100  Kal,  2 
D-XA<  1,K) 

IF(O.EQ.O)  GO  TO  100 

XC(D,1)-0Z 
100  CONTINUE 
110  CONTINUE 
120  CONTINUE 

COM THE FIRST DATA CARD CONTAINS (SEPARATED BY SPACE OR COMMA) :
COM ITER XMAX NORMX CYC M N COUNT
COM THE NEXT DATA CARD MUST BE REPEATED FOR EACH CYCLE USED
COM MX(I) Y(I) R(I) PY(I) PR(I) UDA(,) PMA( ) XA(,) XC(,)
COM Y, R, PY, PR ARE INTEGER INITIAL CONDITIONS
COM MX IS A REAL, THE NEAREST POWER OF 2 SUCH THAT (MX.GE.Y-MAX)
      RETURN
      END
$DATA
$STOP
/*


Fig.  30.  CONTINUED. 

(Loop 120). After each iteration, those integrands that are desired as
output are printed, provided that the iteration number is a multiple of
COUNT (the specified output interval). Loop 110 contains all
calculations for each cycle. The computation follows the steps in
Section A, again with the exception that part of the post-multiplication
operation occurs before the generation of the integral increment and
remainder (step 5). The program has been executed with various input
data sets, and it performed as predicted with no instabilities.
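
The body of Loop 110 can be illustrated for a single integrator with a
short free-form Fortran sketch. It is not part of the thesis listing: the
program name, the routine name dic_step, the register capacities, and the
initial values are all hypothetical, chosen only to show the predicted
post-multiplication being committed or discarded. The listing's DZ1 output
and the routing of DZ to other cycles are omitted.

```fortran
program dic_cycle_sketch
  implicit none
  ! Hypothetical capacities of remainder 1 and remainder 2
  ! (M and N in the listing); the numbers are illustrative only.
  integer, parameter :: m = 8, n = 8
  integer :: y, r, py, pr, dx, dz, step

  y  = 3      ! integrand, held constant in this sketch
  r  = 0      ! remainder 1
  py = 5      ! post-multiplier
  pr = 0      ! remainder 2
  dx = 1      ! increment of the independent variable

  do step = 1, 6
     call dic_step(y, dx, py, r, pr, dz)
     print '(a,i2,3(a,i3))', 'step', step, '  R =', r, '  PR =', pr, '  DZ =', dz
  end do

contains

  subroutine dic_step(y, dx, py, r, pr, dz)
    integer, intent(in)    :: y, dx, py
    integer, intent(inout) :: r, pr
    integer, intent(out)   :: dz
    integer :: sr, tr2

    dz = 0
    ! Predict the sign of the integral increment from the old remainder
    ! and form the post-multiplied sum before integrating.
    sr  = sign(1, r)
    tr2 = pr + sr*py
    if (dx == 0) return

    ! Integrate: accumulate Y*DX into remainder 1.
    r = r + y*dx
    if (abs(r) <= m) return          ! no overflow: tr2 is discarded

    ! Overflow: fold remainder 1 back and commit the predicted sum.
    r  = sign(abs(r) - m, r)
    pr = tr2
    if (abs(tr2) <= n) return        ! remainder 2 did not overflow
    pr = sign(abs(tr2) - n, tr2)
    dz = sign(1, pr)
  end subroutine dic_step

end program dic_cycle_sketch
```

With these assumed values, remainder 1 overflows on steps 3 and 6. The
second overflow carries remainder 2 past its capacity, so DZ = 1 is
emitted on step 6 only.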
Chapter IX
CONCLUSION

The goal of this work has been the development of a machine structure
that would be useful for the numerical solution of differential
equations. Of the various requirements spelled out in Chapter III, the
most important consideration has been to achieve a system which would
be easy to use and which eventually may lead to an integrated GPC-DIC
system, such that the DIC would constitute simply an addition to the
GPC processor. The proposed machine has been simulated on a
general-purpose computer and the program performed satisfactorily.

A modular computing structure has been introduced, which employs
a serial-parallel processing approach. This approach maintains some
of the simplicity of communications in serial machines while, at the
same time, setting an upper limit on the iteration time regardless of
the complexity or size of the problems to be solved. The solution
speed of problems that do not require all available modules may be
increased by distributing the functions evenly over all modules.

A differential equation is considered to be represented by a number
of points in a matrix, where each point designates some integral
function whose output must be communicated to other points. This
representation leads to two basic methods of inter-module communication:
"vertical" and "horizontal" communication. Each of these methods has
its advantages; however, because horizontal communication is more
appropriate for a wider variety of problems, it has been chosen for the
proposed machine.

To eliminate instabilities and oscillations that are inherent in
circular number systems and can occur whenever the value is at or near
the maximum or minimum, a two-loop number system was introduced. The
system has separate positive and negative loops returning to 0 and -1,
respectively. Extending the loops such that they overlap but are not
identical results in a number system with a hysteresis.
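
The difference between the two wrap-around rules can be sketched in a few
lines of free-form Fortran. This is only an illustration under stated
assumptions: the register range (pmax = 7, nmin = -8) and the function
names are hypothetical, and the hysteresis extension is not modeled. The
sketch steps a register by one unit at its extreme values and prints where
each number system lands.

```fortran
program two_loop_sketch
  implicit none
  ! Hypothetical register range: the positive loop runs 0..pmax and
  ! the negative loop runs -1..nmin.
  integer, parameter :: pmax = 7, nmin = -8

  print '(a,i3)', 'circular:  7 + 1 -> ', step_circular(pmax, 1)
  print '(a,i3)', 'two-loop:  7 + 1 -> ', step_two_loop(pmax, 1)
  print '(a,i3)', 'circular: -8 - 1 -> ', step_circular(nmin, -1)
  print '(a,i3)', 'two-loop: -8 - 1 -> ', step_two_loop(nmin, -1)

contains

  ! Circular system: stepping past one extreme lands on the other,
  ! so the sign flips and the value jumps by the full range.
  pure function step_circular(v, d) result(w)
    integer, intent(in) :: v, d      ! d is +1 or -1
    integer :: w
    w = v + d
    if (w > pmax) w = nmin
    if (w < nmin) w = pmax
  end function step_circular

  ! Two-loop system: the positive loop returns to 0 and the
  ! negative loop returns to -1, so the sign is preserved.
  pure function step_two_loop(v, d) result(w)
    integer, intent(in) :: v, d      ! d is +1 or -1
    integer :: w
    w = v + d
    if (w > pmax) w = 0
    if (w < nmin) w = -1
  end function step_two_loop

end program two_loop_sketch
```

In the circular case an overflow at 7 lands on -8 and an underflow at -8
lands on 7, which is the kind of full-scale sign reversal that can sustain
an oscillation. In the two-loop case the same steps land on 0 and -1.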

The proposed machine contains within its processor both pre- and
post-multiplication of the integral increments. It has been shown
that, for the given number system, the outcome of the integrating
cycle can be predicted sufficiently to allow post-multiplication to
occur simultaneously with integration. The integral output may be
multiplied by either a constant or a variable. Theoretically, the
concept of simultaneous integration and post-multiplication can be
extended to include any number of post-multiplication factors (both
constants and variables).

A simplified method that allows floating-point arithmetic has
been introduced, which requires the storing of only a single exponent
for each integral cycle. The use of floating-point arithmetic
provides dynamic scaling during computations, thereby increasing the
accuracy of problem solutions. It also allows the scaling routine
contained in the software to be simplified because any inequalities
that may occur in the scaling equations can be corrected automatically
by an appropriate change of the respective function exponent.
This floating-point method does permit simultaneous integration and
multiplication.

Hardware construction and testing of at least two DIC modules
would be very useful. As experimental information is gathered, the
full capabilities of the system can be determined and further
improvements may suggest themselves. Future work should be conducted
to study the feasibility of hardware incorporation of a unit such as
the DIC into general-purpose computers such that the unit would
appear as another processor. Further study of the "vertical
communication" approach is warranted because, in particular
applications, this approach may lead to a superior system.